THE RETRIEVAL PHASE OF THE HOPFIELD MODEL: 
A RIGOROUS ANALYSIS OF THE OVERLAP DISTRIBUTION* 



Anton Bovier 1

Weierstraß-Institut
für Angewandte Analysis und Stochastik
Mohrenstraße 39, D-10117 Berlin, Germany

Véronique Gayrard 2

Centre de Physique Théorique - CNRS
Luminy, Case 907
F-13288 Marseille Cedex 9, France


Abstract: Standard large deviation estimates or the use of the Hubbard-Stratonovich transformation reduce the analysis of the distribution of the overlap parameters essentially to that of an explicitly known random function $\Phi_{N,\beta}$ on $\mathbb{R}^M$. In this article we present a rather careful study of the structure of the minima of this random function related to the retrieval of the stored patterns. We denote by $m^*(\beta)$ the modulus of the spontaneous magnetization in the Curie-Weiss model and by $\alpha$ the ratio between the number of the stored patterns and the system size. We show that there exist strictly positive numbers $0 < \gamma_a < \gamma_c$ such that 1) if $\sqrt{\alpha} \leq \gamma_a (m^*(\beta))^2$, then the absolute minima of $\Phi$ are located within small balls around the points $\pm m^* e^\mu$, where $e^\mu$ denotes the $\mu$-th unit vector, while 2) if $\sqrt{\alpha} \leq \gamma_c (m^*(\beta))^2$, at least a local minimum surrounded by extensive energy barriers exists near these points. The random location of these minima is given within precise bounds. These are used to prove sharp estimates on the support of the Gibbs measures.

Keywords: Hopfield model, neural networks, storage capacity, Gibbs measures, self-averaging, random matrices

1. Introduction

Over the last few years the so-called Hopfield model of an autoassociative memory [Ho], originally introduced by Figotin and Pastur [FP] as a simplified model of a spin glass, has emerged as one of the more interesting models for spin systems with strongly disordered interactions (for a survey of mathematical results on this model and related topics, see the lecture notes of Petritis [P]). In a series of recent papers we have, partly in collaboration with Pierre Picco, obtained a fairly complete understanding of the thermodynamic properties of the Hopfield model in the regime where the ratio of the number of patterns, $M(N)$, and the number of neurons, $N$, tends to zero [BGP1,BG2], and even if $\lim M/N = \alpha > 0$, for very small $\alpha$, we have been able to prove the existence of disjoint Gibbs states corresponding to the different patterns at sufficiently low temperatures [BGP2]. Technically, this relied in one way or another on large deviation estimates for the distribution of the overlap parameters.

The purpose of the present note is to present a more refined analysis of these large deviation estimates, intended for a more detailed investigation of the critical points of the associated rate function and of its behaviour near them in the case where $\alpha$ is strictly positive, though small. These are relevant not only for the analysis of the Gibbs states (where only the absolute minima are important) but also for the characterization of the long-time characteristics of the stochastic retrieval dynamics of the system. From numerical experiments and the replica heuristic it is expected that local minima of the "free energy functional" persist for considerably larger values of $\alpha$ than those for which they are absolute minima [AGS]. The 'storage capacity' is usually defined as the maximal value of $\alpha$ for which the local minima near the patterns exist. Newman [N], in a seminal paper of 1988, has proven a lower bound for the critical $\alpha$ for zero temperature (see also [KPa]). One of the main results of the present paper is an extension of this finding to positive temperatures. In particular, we give estimates on the behaviour of the critical $\alpha$ as a function of the temperature that show the expected power law behaviour near $T = 1$. Furthermore, we will compute rather precisely the exact (random) location of these minima and we will show that, for $T$ not too small, the rate function near the location of the original patterns is locally convex, implying that there exists a unique local minimum near the patterns. Moreover, we will show that the only macroscopic component of the overlap vector at the minima is (at $T \sim 0$) shifted down from one by a term of order $\exp(-1/(2\alpha))$, as predicted in [AGS].

Let us recall the definitions of the Hopfield model and the main quantities of interest. Let $\mathcal{S}_N \equiv \{-1,1\}^N$ denote the set of functions $\sigma : \{1,\dots,N\} \to \{-1,1\}$, and set $\mathcal{S} \equiv \{-1,1\}^{\mathbb{N}}$. We call $\sigma$ a spin configuration and denote by $\sigma_i$ the value of $\sigma$ at $i$. Let $(\Omega, \mathcal{F}, \mathbb{P})$ be an abstract probability space and let $\xi_i^\mu$, $i,\mu \in \mathbb{N}$, denote a family of independent identically distributed random variables on this space. For the purposes of this paper we will assume that $\mathbb{P}[\xi_i^\mu = \pm 1] = \frac12$, but more general distributions can be considered. We will write $\xi^\mu[\omega]$ for the $N$-dimensional random vector whose $i$-th component is given by $\xi_i^\mu[\omega]$ and call such a vector a 'pattern'. On the other hand, we use the notation $\xi_i[\omega]$ for the $M$-dimensional vector with the same components. When we write $\xi[\omega]$ without indices, we frequently will consider it as an $M \times N$ matrix and we write $\xi^t[\omega]$ for the transpose of this matrix. Thus, $\xi^t\xi$ is the $M \times M$ matrix whose elements are $\sum_{i=1}^N \xi_i^\mu \xi_i^\nu$. With this in mind we will use throughout the paper a vector notation with $(\cdot,\cdot)$ standing for the scalar product in whatever space the argument may lie. E.g. the expression $(y, \xi_i)$ stands for $\sum_{\mu=1}^M \xi_i^\mu y_\mu$, etc.

We define random maps $m_N^\mu[\omega] : \mathcal{S}_N \to [-1,1]$ through ¹

$$ m_N^\mu[\omega](\sigma) \equiv \frac{1}{N}\sum_{i=1}^N \xi_i^\mu[\omega]\,\sigma_i \tag{1.1} $$

Naturally, these maps 'compare' the configuration $\sigma$ globally to the random configuration $\xi^\mu[\omega]$. A Hamiltonian is now defined as the simplest negative function of these variables, namely

$$ H_N[\omega](\sigma) \equiv -\frac{N}{2}\sum_{\mu=1}^{M(N)} \left(m_N^\mu[\omega](\sigma)\right)^2 = -\frac{N}{2}\,\|m_N[\omega](\sigma)\|_2^2 \tag{1.2} $$

where $M(N)$ is some, generally increasing, function that crucially influences the properties of the model. $\|\cdot\|_2$ denotes the $\ell_2$-norm in $\mathbb{R}^M$, and the vector $m_N[\omega](\sigma)$ is always understood to be $M(N)$-dimensional.
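
The definitions of the overlaps and of the Hamiltonian can be made concrete with a minimal sketch. The function names `overlaps`, `hamiltonian` and `sample_patterns`, as well as the sizes `N` and `M`, are illustrative choices and not part of the paper; the code only evaluates $m_N(\sigma)$ and $H_N(\sigma) = -\frac{N}{2}\|m_N(\sigma)\|_2^2$ for symmetric $\pm 1$ patterns.

```python
import random


def sample_patterns(m_count, n, rng):
    """Draw m_count patterns of length n with i.i.d. symmetric +-1 entries."""
    return [[rng.choice((-1, 1)) for _ in range(n)] for _ in range(m_count)]


def overlaps(xi, sigma):
    """Overlap vector: m^mu = (1/N) * sum_i xi^mu_i * sigma_i for each pattern mu."""
    n = len(sigma)
    return [sum(p * s for p, s in zip(pattern, sigma)) / n for pattern in xi]


def hamiltonian(xi, sigma):
    """H_N(sigma) = -(N/2) * squared l2-norm of the overlap vector."""
    n = len(sigma)
    return -0.5 * n * sum(m * m for m in overlaps(xi, sigma))


rng = random.Random(0)          # fixed seed, illustrative only
N, M = 200, 5                   # hypothetical system size and number of patterns
xi = sample_patterns(M, N, rng)
sigma = list(xi[0])             # configuration equal to the first pattern
m = overlaps(xi, sigma)
energy = hamiltonian(xi, sigma)
```

For a configuration equal to one of the patterns, the corresponding overlap equals one, while the remaining overlaps are typically of order $N^{-1/2}$.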

Through this Hamiltonian we define in a natural way finite volume Gibbs measures on $\mathcal{S}_N$ via

$$ \mu_{N,\beta}[\omega](\sigma) \equiv \frac{1}{Z_{N,\beta}[\omega]}\, e^{-\beta H_N[\omega](\sigma)} \tag{1.3} $$

and the induced distribution of the overlap parameters

$$ \mathcal{Q}_{N,\beta}[\omega] \equiv \mu_{N,\beta}[\omega] \circ m_N[\omega]^{-1} \tag{1.4} $$

The normalizing factor $Z_{N,\beta}[\omega]$, given by

$$ Z_{N,\beta}[\omega] \equiv 2^{-N}\sum_{\sigma\in\mathcal{S}_N} e^{-\beta H_N[\omega](\sigma)} \equiv \mathbb{E}_\sigma e^{-\beta H_N[\omega](\sigma)} \tag{1.5} $$

is called the partition function. We will frequently consider the non-normalized probabilities that $m_N(\sigma)$ lies in a ball in $\mathbb{R}^M$ of radius $\rho$ centered at $m$,

$$ Z_{N,\beta,\rho}[\omega](m) \equiv \mathbb{E}_\sigma e^{-\beta H_N[\omega](\sigma)}\, 1_{\{\|m_N[\omega](\sigma) - m\|_2 \leq \rho\}} \tag{1.6} $$

¹ We will make the dependence of random quantities on the random parameter $\omega$ explicit by an added $[\omega]$ whenever we want to stress it. Otherwise, we will frequently drop the reference to $\omega$ to simplify the notation.

We are interested in the exponential asymptotics of these quantities, i.e. in the behaviour of the functions

$$ f_{N,\beta,\rho}[\omega](m) \equiv -\frac{1}{\beta N}\ln Z_{N,\beta,\rho}[\omega](m) \tag{1.7} $$

and in particular in the location of the critical points of these functions when $N$ tends to infinity, since these determine not only the asymptotic properties of the Gibbs measures, but also the long-time features of a stochastic dynamics (the so-called "retrieval dynamics") chosen such that the Gibbs measures are its equilibrium distribution.

A study of these functions has been undertaken in a number of previous papers, using either the so-called Hubbard-Stratonovich transformation [FP,K,BGP1], or standard large deviation estimates [BG2]. In the Hubbard-Stratonovich approach, one considers instead of the measure $\mathcal{Q}_{N,\beta}$ itself its convolution with a Gaussian measure on $\mathbb{R}^M$ of mean zero and variance $(\beta N)^{-1}\mathbb{I}$ (where $\mathbb{I}$ is the identity matrix). The resulting measure $\tilde{\mathcal{Q}}_{N,\beta}$ is absolutely continuous and has a density proportional to

$$ \exp\left(-\beta N \Phi_{N,\beta}(z)\right) \tag{1.8} $$

with respect to $M$-dimensional Lebesgue measure. The function $\Phi_{N,\beta}(z)$ can be computed explicitly and is given by

$$ \Phi_{N,\beta}[\omega](z) = \frac12\|z\|_2^2 - \frac{1}{\beta N}\sum_{i=1}^N \ln\cosh\left(\beta(\xi_i, z)\right) \tag{1.9} $$
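
As an illustration of (1.9), the following sketch evaluates the random function $\Phi_{N,\beta}$ for one realization of the patterns. The names `big_phi` and `dot`, the seed, and the values of `N`, `M`, `beta` and `z` are assumptions made for the example only.

```python
import math
import random


def dot(u, v):
    """Euclidean scalar product of two equally long sequences."""
    return sum(a * b for a, b in zip(u, v))


def big_phi(xi_rows, z, beta):
    """Phi(z) = |z|^2 / 2 - (1/(beta*N)) * sum_i ln cosh(beta * (xi_i, z)).

    xi_rows[i] is the M-dimensional vector xi_i of the i-th site.
    """
    n = len(xi_rows)
    log_sum = sum(math.log(math.cosh(beta * dot(row, z))) for row in xi_rows)
    return 0.5 * dot(z, z) - log_sum / (beta * n)


rng = random.Random(1)                      # fixed seed, illustrative only
N, M, beta = 100, 3, 2.0                    # hypothetical parameters
xi_rows = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(N)]
z = [0.9, 0.1, -0.2]
value = big_phi(xi_rows, z, beta)
```

Two elementary properties are visible directly from the formula: $\Phi_{N,\beta}(0) = 0$ and $\Phi_{N,\beta}(z) = \Phi_{N,\beta}(-z)$; for $M = 1$ the function reduces to the deterministic Curie-Weiss expression $z^2/2 - \beta^{-1}\ln\cosh(\beta z)$.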

The results obtained in [BGP1,BGP2] on the concentration of the limiting Gibbs measures were based on an analysis of the location of the absolute minima of the function $\Phi_{N,\beta}$. One may notice that the measures $\mathcal{Q}_{N,\beta}$ and $\tilde{\mathcal{Q}}_{N,\beta}$ are related by a convolution with a measure that is, asymptotically as $N \uparrow \infty$, concentrated sharply on a sphere of radius $\sqrt{\alpha/\beta}$.

This allows one to recover localization properties of the measure $\mathcal{Q}_{N,\beta}$ up to that precision from those of $\tilde{\mathcal{Q}}_{N,\beta}$. An alternative approach using standard large deviation estimates can also be used (see [BG2]) and reveals that as far as the analysis of the critical points of $f_{N,\beta,\rho}(m)$ is concerned, this also boils down to the study of the same function $\Phi_{N,\beta}$. Notably, the lower large deviation estimates can be obtained only for $\rho > \sqrt{2\alpha}$, so that in this way virtually the same precision on localization properties is obtained, and both approaches seem practically equivalent and may be used alternatively according to what appears more convenient in a given situation.

We see that in any case, further progress relies on better estimates on the behaviour of this function, and it is the purpose of the present paper to provide a considerably more precise analysis of it than that given in [BGP1]. In particular we get (up to constants) the conjectured behaviour of the critical temperature as a function of $\alpha$, for $\alpha$ small. Let us formulate our main results. We denote here and in the sequel by $m^*(\beta)$ the largest solution of the equation $m = \tanh(\beta m)$. Note that $m^*(\beta)$ is strictly positive for all $\beta > 1$, $\lim_{\beta\uparrow\infty} m^*(\beta) = 1$, and $\lim_{\beta\downarrow 1} \frac{(m^*(\beta))^2}{3(\beta-1)} = 1$. Let us denote by $B_\rho(x)$ the ball of radius $\rho$ centered at $x$ in $\mathbb{R}^M$. We denote by $e^\mu$ the $\mu$-th unit vector in $\mathbb{R}^M$. We will see that the relevant small parameter in our problem is always the ratio between $\sqrt{\alpha}$ and $(m^*(\beta))^2$. We will therefore use the general convention to set $\sqrt{\alpha} = \gamma (m^*(\beta))^2$ and we will treat $\gamma$ as our small parameter. Our main results can then be summarized in the following theorems (which however do not contain all the precise estimates on constants that can be found in the later sections).
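
The quantity $m^*(\beta)$ has no closed form, but it is easy to compute. The sketch below uses plain fixed-point iteration started at $m = 1$, which decreases monotonically to the largest solution of $m = \tanh(\beta m)$; the function name `m_star`, the tolerance and the iteration cap are illustrative assumptions.

```python
import math


def m_star(beta, tol=1e-14, max_iter=200000):
    """Largest solution of m = tanh(beta * m), by fixed-point iteration from m = 1.

    For beta <= 1 the only solution is m = 0, which is returned directly.
    """
    if beta <= 1.0:
        return 0.0
    m = 1.0
    for _ in range(max_iter):
        new_m = math.tanh(beta * m)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


near_critical = m_star(1.01)    # small magnetization close to beta = 1
low_temperature = m_star(10.0)  # close to 1 at low temperature
```

The two limiting regimes quoted in the text can be observed numerically: near $\beta = 1$ the ratio $(m^*)^2 / (3(\beta - 1))$ is close to one, and for large $\beta$ the value approaches one.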

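Theorem 1: There exists $\gamma_a > 0$ such that for all $\beta > 1$ and for $\alpha \leq \gamma_a^2 (m^*(\beta))^4$ there exist constants $c_0 < 1/2$, $c_1 > 0$ such that $\mathbb{P}$-almost surely for all but a finite number of indices $N$, for all $m \in \left\{\cup_{(\mu,s)} B_{c_0 \gamma m^*}(s m^* e^\mu)\right\}^c$,

$$ \Phi_{N,\beta}[\omega](m) - \Phi_{N,\beta}[\omega](m^* e^1) \geq c_1 (m^*)^2 \inf_{\mu,s} \|m - s m^* e^\mu\|_2^2 \tag{1.10} $$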

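Theorem 2: Let $z^{(\mu)} \in \mathbb{R}^{M(N)}$ denote the random vector whose $\nu$-th component is $z^{(\mu)}_\nu = \frac{1}{N}\sum_{i=1}^N \xi_i^\mu \xi_i^\nu$, if $\nu \neq \mu$, and $z^{(\mu)}_\mu = 0$. There exists $\gamma_c > 0$ such that for all $\beta > 1$ and for $\alpha \leq \gamma_c^2 (m^*(\beta))^4$ there exist strictly positive constants $c_2, c_3$ such that $\mathbb{P}$-almost surely for all but finitely many $N$, for all $v$ such that $\|v\|_2 \leq c_3 m^*$ and

$$ \left\| v - z^{(\mu)}\, \frac{m^*(\beta)}{1 - \beta(1 - (m^*)^2)} \right\|_2 > c_2\, m^*(\beta)\, \gamma^{3/2}\, [\dots] \tag{1.11} $$

then

$$ \Phi_{N,\beta}[\omega](m^* e^\mu + v) > \inf_{\|w\|_2 \leq c_3 m^*} \Phi_{N,\beta}[\omega](m^* e^\mu + w) \tag{1.12} $$
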

We obtain bounds on the various constants in the different asymptotic regimes in the course of the proofs. Our bound on the constant $\gamma_c$ will be considerably larger (of order 0.04 for $\beta$ large) than the one for $\gamma_a$ (of order $10^{-4}$), in accordance with the general expectation that the local minima corresponding to the patterns persist for values of $\alpha$ where they are no longer the absolute minima. Let us remark that a very similar analysis could also be carried out to prove the existence of further local minima associated to so-called "mixed states" (see e.g. [N]), but we leave this to the interested reader.

As a consequence of the previous theorems and the estimates entering their proofs we get the 
following theorem on the Gibbs measures. 

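Theorem 3: For all $\beta > 1$ and $\alpha \leq \gamma_a^2 (m^*(\beta))^4$ there exists a constant $c_5 < \infty$ such that

$$ \lim_{N\uparrow\infty} \mathcal{Q}_{N,\beta}[\omega]\left(\cup_{(\mu,s)} B_{c_5 \gamma m^*}(s e^\mu m^*)\right) = 1, \quad \mathbb{P}\text{-a.s.} \tag{1.13} $$

Moreover, for any pair of indices $\mu,\nu$,

$$ \lim_{N\uparrow\infty} \frac{1}{N}\ln\, [\dots], \quad \mathbb{P}\text{-a.s.} \tag{1.14} $$
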



Remark: Theorem 3 sharpens the results of [BGP1] and [BGP2]. (1.14) guarantees that limiting measures concentrated on a single ball can be constructed by applying a magnetic field aligned on one of the patterns whose strength can be taken to zero after the limit $N \uparrow \infty$ is taken. See [BGP1] for a general discussion on limiting Gibbs measures. In a recent note [T2] Talagrand has announced an estimate similar to (1.13) under some additional restrictions on $\beta$.

The remainder of this paper is structured as follows. The next section introduces a new very sharp bound on the behaviour of the maximal eigenvalue of the random matrix $\xi^t\xi/N$. While we believe that this result has some interest in itself in that it provides considerably sharper bounds than were previously available (the sharpest ones, to our knowledge, being due to Shcherbina and Tirozzi [ST], were of the order $\exp(-N^{2/3})$ only), this introduces some of the basic 'new' techniques in a rather simple situation and can thus be seen as a warm up for what will follow. In Section 3 we improve the estimates of [BGP1] by locating more precisely the absolute minima of $\Phi_{N,\beta}$ for very small $\alpha$. Section 4 is the central part of this work. Here we control the precise location of the local minima corresponding to the patterns and control the behaviour of $\Phi_{N,\beta}$ near them. The main difficulty we have to overcome here is that the function $\Phi_{N,\beta}$ is random. The usual way to get precise estimates on a function near its minima is to use a Taylor expansion. Due to the randomness, there can be no uniform control over the remainder terms, but we have to deal with the probabilities of large excursions. To estimate those, we need to control suprema of certain random processes that are indexed by continuous parameters taking values in high-dimensional sets. In this analysis we invoke techniques introduced in the analysis of the regularity of random processes in Banach spaces (see [LT]). This rather long section is subdivided into three subsections: In part 1 we prove the uniform upper and lower bounds on $\Phi$. In part 2 these are used to localize the position of the minima. Here we also prove the local convexity of $\Phi$. In part 3 we localize the value of the unique macroscopic component of the position of the minima and show that in the limit $\beta \uparrow \infty$ it differs from one by a term proportional to $\exp(-1/(2\alpha))$. In Section 5 we apply the previous estimates to prove Theorem 3. An appendix contains the proof of a technical lemma needed in Section 4.3.

Acknowledgements: We thank Michel Talagrand for sending us a copy of [T1], through which we learned about Theorem 2.5. We are grateful to Barbara Gentz for helpful comments on earlier drafts of this paper. A.B. thanks Dmitry Ioffe for useful discussions on the proof of Lemma 4.18.

2. An exponential bound on random matrix norms

As a technical warm-up for what is to come, as well as a basic input for the remainder, we will show how techniques of the types used in the analysis of random processes (for an exposition see e.g. [LT]) and concentration of measure estimates (we refer explicitly to the recent paper [T1] by M. Talagrand) can be used to get exponential bounds on the maximal eigenvalues of random matrices that are relevant for our analysis. Note that subexponential bounds have been known for a long time and were generally used in our previous analysis [ST,K,BG2,BGP1].

We are interested in the matrix $A_N \equiv \frac{\xi^t\xi}{N}$. (To simplify notation, we will frequently drop the index $N$ and write $A$ for the matrix $A_N$ in the generic dimension $N$.) We begin with the simplest a-priori estimate on the corresponding quadratic form:

Lemma 2.1: For any non-zero $x \in \mathbb{R}^M$ and for all $c > 0$

$$ \mathbb{P}\left[(x, A_N x) \geq (1+c)\|x\|_2^2\right] \leq \exp\left\{-\frac{N}{2}\left(c - \ln(1+c)\right)\right\} \tag{2.1} $$



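Proof: We simply use the exponential Chebyshev inequality and the Hubbard-Stratonovich transformation [HS] to see that

$$ \begin{aligned} \mathbb{P}\left[(x,Ax) \geq (1+c)\|x\|_2^2\right] &= \mathbb{P}\left[\frac{1}{N}\sum_{i=1}^N \left(\xi_i, \frac{x}{\|x\|_2}\right)^2 \geq 1+c\right] \\ &\leq \inf_{0\leq t<1/2} e^{-t(1+c)N}\, \mathbb{E}\exp\left(t\sum_{i=1}^N \left(\xi_i, \frac{x}{\|x\|_2}\right)^2\right) \\ &\leq \inf_{0\leq t<1/2} e^{-t(1+c)N} \left(\frac{1}{1-2t}\right)^{N/2} \end{aligned} \tag{2.2} $$

Now the infimum over $t$ in the last line of (2.2) is attained for $t = \frac{c}{2(1+c)}$, and inserting this value in (2.2) yields (2.1).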
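
The optimization over $t$ in the last step can be checked numerically. The sketch below assumes that the quantity to be minimized is, per unit of $N$, the exponent $-t(1+c) - \frac12\ln(1-2t)$ on $0 \leq t < 1/2$; the function names are illustrative. It compares the value at $t = c/(2(1+c))$ with the rate $\frac12(c - \ln(1+c))$ appearing in (2.1).

```python
import math


def chernoff_exponent(c, t):
    """Exponent per unit N of exp(-t(1+c)N) * (1-2t)^(-N/2), for 0 <= t < 1/2."""
    return -t * (1.0 + c) - 0.5 * math.log(1.0 - 2.0 * t)


def optimal_t(c):
    """Stationary point of the exponent: 1/(1-2t) = 1+c."""
    return c / (2.0 * (1.0 + c))


def rate(c):
    """Rate in Lemma 2.1: (c - ln(1+c)) / 2."""
    return 0.5 * (c - math.log(1.0 + c))


values = {c: chernoff_exponent(c, optimal_t(c)) for c in (0.1, 1.0, 5.0)}
```

Since the exponent is convex in $t$, the stationary point is the global minimizer, so no other choice of $t$ improves on the stated bound.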



Let us now introduce a family of grids $W_{M,r}$ in $\mathbb{R}^M$ with spacing $r/\sqrt{M}$. We denote by $W_{M,r}(\rho)$ the set of points $x \in W_{M,r}$ such that $\|x\|_2 \leq \rho$. We have

Lemma 2.2: Let $B_r(x)$ denote the ball of radius $r$ centered at $x$. Then
(i) $\cup_{x\in W_{M,r}} B_r(x) = \mathbb{R}^M$;
(ii) $|W_{M,r}(\rho)| \leq e^{M(\ln(\rho/r) + c)}$, for some constant $c \leq 1$.

Proof: Statement (i) follows since the length of the diagonal in an $M$-dimensional cube of side length $r/\sqrt{M}$ equals $r$. Statement (ii) reflects the fact that the volume of a ball of radius $\rho$ in $\mathbb{R}^M$ is $2\rho^M\pi^{M/2}/(M\,\Gamma(M/2))$.



We will control the norm of the matrices $A$ by using the definition of the matrix norm

$$ \|A\| \equiv \sup_{x:\, \|x\|_2 = 1} (x, Ax) \tag{2.3} $$

To estimate the probabilities of suprema over continuous sets of random variables, we will employ a technique used by Ledoux and Talagrand, for instance in their textbook [LT]. To this end we fix a number $a < 1$ to be chosen later and choose a sequence $r_n \equiv a^n$. Then any $x$ with norm one can be written in some (possibly non-unique) way as

$$ x = \sum_{n=1}^{n^*+1} x(n) \tag{2.4} $$

where $x(n) \in W_{M,r_n}(r_{n-1})$ for $n \leq n^*$ and $\|x(n^*+1)\|_2 \leq r_{n^*}$. We will abbreviate for simplicity $W(n) \equiv W_{M,r_n}(r_{n-1})$. This gives that

$$ \sup_{x:\, \|x\|_2 = 1} (x, Ax) = \sup_{x(1)\in W(1)} \dots \sup_{x(n^*)\in W(n^*)}\ \sup_{x(n^*+1):\, \|x(n^*+1)\|_2 \leq r_{n^*}} \left(\sum_n x(n),\, A \sum_n x(n)\right) \tag{2.5} $$


To make good use of this formula, the following elementary lemma is of great help: 

Lemma 2.3: Let $b_n$, $n \geq 1$, be any absolutely summable sequence of real numbers. Then, for all $q^2 > 0$,

$$ \left(\sum_{n=1}^{n^*+1} b_n\right)^2 \leq (1+q^2)\sum_{n=1}^{n^*} (1+q^{-2})^{n-1}\, b_n^2 + (1+q^{-2})^{n^*}\, b_{n^*+1}^2 \tag{2.6} $$

Of course this formula is useful only if $b_n^2 (1+q^{-2})^{n-1}$ is summable.
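
A quick numerical sanity check of the inequality in Lemma 2.3 may be helpful. The sketch below assumes the form $(\sum_{n=1}^{n^*+1} b_n)^2 \leq (1+q^2)\sum_{n=1}^{n^*}(1+q^{-2})^{n-1}b_n^2 + (1+q^{-2})^{n^*}b_{n^*+1}^2$ and tests it on random finite sequences; the helper names and the sampling ranges are arbitrary.

```python
import random


def lhs(b):
    """Left-hand side: square of the sum of b_1, ..., b_{n*+1}."""
    return sum(b) ** 2


def rhs(b, q2):
    """Right-hand side for b = [b_1, ..., b_{n*+1}] and q2 = q^2 > 0."""
    n_star = len(b) - 1
    w = 1.0 + 1.0 / q2                       # the factor (1 + q^-2)
    head = (1.0 + q2) * sum(w ** k * b[k] ** 2 for k in range(n_star))
    tail = w ** n_star * b[n_star] ** 2      # last term carries no (1 + q^2)
    return head + tail


rng = random.Random(2)                        # fixed seed, illustrative only
samples = []
for _ in range(200):
    length = rng.randint(1, 8)
    b = [rng.uniform(-1.0, 1.0) for _ in range(length)]
    q2 = rng.uniform(0.05, 5.0)
    samples.append((b, q2))
```

The list index `k` runs from 0, so `w ** k` corresponds to the power $n-1$ for the term $b_n$ with $n = k+1$.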



Proof: The proof of this lemma follows by induction from the elementary observation that for all $q^2 > 0$, first

$$ 2x = 2 q q^{-1} x = q^2 + q^{-2}x^2 - (q - q^{-1}x)^2 \leq q^2 + q^{-2}x^2 \tag{2.7} $$

and whence

$$ \begin{aligned} (b+c)^2 &= b^2\left(1 + 2\frac{c}{b} + \frac{c^2}{b^2}\right) \\ &\leq b^2 + c^2 + q^2 b^2 + q^{-2}c^2 = b^2(1+q^2) + c^2(1+q^{-2}) \end{aligned} \tag{2.8} $$



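Lemma 2.3 allows us to write, for $q$ to be chosen later, that

$$ \begin{aligned} \sup_{x\in\mathbb{R}^M:\, \|x\|_2=1} (x,Ax) \leq{}& (1+q^2)\sum_{n=1}^{n^*} (1+q^{-2})^{n-1} \sup_{x(n)\in W(n)} (x(n), Ax(n)) \\ &+ (1+q^{-2})^{n^*} \sup_{x(n^*+1):\, \|x(n^*+1)\|_2 \leq r_{n^*}} (x(n^*+1), Ax(n^*+1)) \end{aligned} \tag{2.9} $$

But combining (2.1) with (ii) of Lemma 2.2, we get that

$$ \mathbb{P}\left[\sup_{x(n)\in W(n)} (x(n), Ax(n)) \geq (1+c)\,a^{2(n-1)}\right] \leq e^{M(|\ln a| + 1) - N g(c)} \tag{2.10} $$

where we have set

$$ g(c) \equiv \tfrac12\left\{c - \ln(1+c)\right\} \tag{2.11} $$

Therefore,

$$ \begin{aligned} &\mathbb{P}\left[\sum_{n=1}^{n^*} (1+q^2)(1+q^{-2})^{n-1} \sup_{x(n)\in W(n)} (x(n), Ax(n)) \geq (1+c)(1+q^2)\sum_{n=1}^{n^*} (1+q^{-2})^{n-1} a^{2(n-1)}\right] \\ &\qquad \leq \sum_{n=1}^{n^*} \mathbb{P}\left[\sup_{x(n)\in W(n)} (x(n), Ax(n)) \geq (1+c)\|x(n)\|_2^2\right] \\ &\qquad \leq n^*\, e^{M(|\ln a| + 1) - N g(c)} \end{aligned} \tag{2.12} $$
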

On the other hand, it is a trivial matter to see that uniformly,

$$ (1+q^{-2})^{n^*} (x(n^*+1), Ax(n^*+1)) \leq M (1+q^{-2})^{n^*} \|x(n^*+1)\|_2^2 \leq M\left((1+q^{-2})a^2\right)^{n^*} \tag{2.13} $$

We thus obtain, combining our estimates,

$$ \mathbb{P}\left[\sup_{\|x\|_2=1} (x,Ax) \geq (1+c)\left\{\frac{1+q^2}{1-(1+q^{-2})a^2} + M\left((1+q^{-2})a^2\right)^{n^*}\right\}\right] \leq n^*\, e^{M(|\ln a| + 1) - N g(c)} \tag{2.14} $$


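Of course the constants $q$ and $a$ have been assumed to satisfy $(1+q^{-2})a^2 < 1$. It remains now to choose these constants as well as $n^*$. Without attempting a strict optimization, a reasonable choice turns out to be, for $\sqrt{\alpha} \leq 1/2$,

$$ q^2 = a = \sqrt{\alpha} \tag{2.15} $$

With this choice, $(1+q^{-2})a^2 = \alpha + \sqrt{\alpha} \leq \frac34$, and the remainder term (2.13) is bounded by $1/N$ if $n^* = \ln(MN)/|\ln(\frac34)|$. If we moreover set $c = g^{-1}\left(\alpha(|\ln\alpha|/2 + 1) + \epsilon\right)$, $\epsilon > 0$, (2.14) finally gives

$$ \mathbb{P}\left[\sup_{x:\, \|x\|_2=1} (x,Ax) \geq \left\{\frac{1+\sqrt{\alpha}}{1-\alpha-\sqrt{\alpha}} + \frac{1}{N}\right\}\left(1 + g^{-1}\left(\alpha(|\ln\alpha|/2+1)+\epsilon\right)\right)\right] \leq \frac{\ln(MN)}{|\ln(\frac34)|}\, e^{-\epsilon N} \tag{2.16} $$
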



This bound is not very good to determine the true norm of $A$, but it gives very good estimates on probabilities of very large excesses. We will now bootstrap this result with the help of a general 'concentration of measure' theorem of M. Talagrand [T1]. To this end we need the following properties of the norm of $A$ as a function of $\xi$.

Lemma 2.4: Set $\lambda_N(\omega) \equiv \sup_{x:\, \|x\|_2=1} (x, A_N[\omega]x)$. Then
(i) The function $\lambda_N(\omega)$ is a convex function of the random variables $\xi(\omega)$.
(ii) $\lambda_N(\omega)$ satisfies the Lipschitz bound

$$ |\lambda_N(\omega) - \lambda_N(\omega')| \leq \frac{\sqrt{2}}{\sqrt{N}}\sqrt{\lambda_N(\omega) + \lambda_N(\omega')}\ \|\xi(\omega) - \xi(\omega')\|_2 \tag{2.17} $$




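Proof: To prove (i), note that $(x, Ax) = \frac{1}{N}\sum_{i=1}^N (\xi_i, x)^2$ is a convex function of $\xi$ for fixed $x$. But the supremum of a family of convex functions is again convex.

To prove (ii), note first that

$$ \left|\sup_{x:\, \|x\|_2=1} (x, A(\omega)x) - \sup_{x:\, \|x\|_2=1} (x, A(\omega')x)\right| \leq \sup_{x:\, \|x\|_2=1} \left|(x, A(\omega)x) - (x, A(\omega')x)\right| \tag{2.18} $$

But

$$ \left|(x, A(\omega)x) - (x, A(\omega')x)\right| \leq \sqrt{\frac{1}{N}\sum_i \left((\xi_i(\omega)+\xi_i(\omega')), x\right)^2}\ \sqrt{\frac{1}{N}\sum_i \left((\xi_i(\omega)-\xi_i(\omega')), x\right)^2} \tag{2.19} $$

Now

$$ \frac{1}{N}\sum_i \left((\xi_i(\omega)+\xi_i(\omega')), x\right)^2 \leq \frac{2}{N}\sum_i (\xi_i(\omega), x)^2 + \frac{2}{N}\sum_i (\xi_i(\omega'), x)^2 \tag{2.20} $$

while

$$ \frac{1}{N}\sum_i \left((\xi_i(\omega)-\xi_i(\omega')), x\right)^2 \leq \frac{1}{N}\sum_i\sum_\mu \left(\xi_i^\mu(\omega) - \xi_i^\mu(\omega')\right)^2 \sum_\mu x_\mu^2 = \frac{1}{N}\|\xi(\omega) - \xi(\omega')\|_2^2 \tag{2.21} $$

from which (2.17) follows. ◊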

Theorem 2.5: ([T1]) Let $f$ be a real valued function defined on $[-1,1]^N$. Assume that for each real number $a$, the set $\{f \leq a\}$ is convex. Suppose that on a convex set $B \subset [-1,1]^N$ the restriction of $f$ to $B$ satisfies for all $x, y \in B$

$$ |f(x) - f(y)| \leq l_B \|x - y\|_2 \tag{2.22} $$

for some constant $l_B > 0$. Let $h$ denote the random variable $h = f(X_1,\dots,X_N)$. Then, if $M_f$ is a median of $h$, for all $t > 0$,

$$ \mathbb{P}\left[|h - M_f| \geq t\right] \leq 4b + \frac{4}{1-2b}\exp\left(-\frac{t^2}{16\, l_B^2}\right) \tag{2.23} $$

where $b$ denotes the probability of the complement of the set $B$.

We see that due to Lemma 2.4 we are exactly in the situation where we may apply this theorem 
with h being the norm of A. 

This gives us the following 

Theorem 2.6: Assume that $\alpha \leq 1/4$. Then there exists a constant $K = K(\alpha) < \infty$ such that for all $x \leq 1$

$$ \mathbb{P}\left[\big|\, \|A\| - \mathbb{E}\|A\|\, \big| \geq x\right] \leq K e^{-N x^2/K} \tag{2.24} $$

The same result holds for $A$ replaced by $A - \mathbb{I}$.

Remark: From Theorem 2.5 we get an exponential estimate on $\big|\,\|A\| - M_{\|A\|}\big|$. But it is easy to see that this together with (2.19) in turn implies the exponential estimate (2.24) (with slightly modified constants).

From the known standard estimates on the eigenvalues of $A$ (the first reference to our knowledge is [Ge]) we know that the median of $\|A\|$ equals $(1+\sqrt{\alpha})^2$ and that of $\|A - \mathbb{I}\|$ equals $2\sqrt{\alpha} + \alpha$, up to corrections that tend to zero with $N$ rapidly.

Proof: Theorem 2.6 is a direct consequence of Lemma 2.4 and Theorem 2.5, together with the estimate (2.16), used for some suitable small value of $\epsilon$. Since Lemma 2.4 holds also for the norm of $A - \mathbb{I}$, we get the same estimate for the norm of that matrix. The constant $K(\alpha)$ can be estimated more precisely from our bounds, but its value will be of no particular importance for the rest of this paper. ◊
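
The typical value $(1+\sqrt{\alpha})^2$ of $\|A\|$ can be observed in a small simulation. The sketch below builds the $M \times M$ matrix with entries $\frac1N\sum_i \xi_i^\mu\xi_i^\nu$ for one sample of symmetric $\pm1$ patterns and estimates its largest eigenvalue by power iteration. The sizes `N`, `M`, the seed and the iteration count are arbitrary illustrative choices, and the agreement is only approximate at such small sizes.

```python
import math
import random


def gram_matrix(xi):
    """M x M matrix with entries (1/N) * sum_i xi^mu_i * xi^nu_i (xi given as M rows)."""
    m_count, n = len(xi), len(xi[0])
    return [[sum(a * b for a, b in zip(xi[mu], xi[nu])) / n
             for nu in range(m_count)] for mu in range(m_count)]


def largest_eigenvalue(a, iters=300):
    """Power iteration; adequate here since the matrix is symmetric and positive semidefinite."""
    dim = len(a)
    x = [1.0 / math.sqrt(dim)] * dim
    lam = 0.0
    for _ in range(iters):
        y = [sum(a_ij * x_j for a_ij, x_j in zip(row, x)) for row in a]
        lam = math.sqrt(sum(v * v for v in y))   # |A x| for a unit vector x
        x = [v / lam for v in y]
    return lam


rng = random.Random(3)                 # fixed seed, illustrative only
N, M = 300, 30                         # hypothetical sizes, alpha = M / N = 0.1
xi = [[rng.choice((-1, 1)) for _ in range(N)] for _ in range(M)]
alpha = M / N
lam_max = largest_eigenvalue(gram_matrix(xi))
edge = (1.0 + math.sqrt(alpha)) ** 2   # predicted typical value of the norm of A
```

The quantity `lam_max - 1` can be compared in the same way with $2\sqrt{\alpha} + \alpha$, the value quoted for $\|A - \mathbb{I}\|$.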

Theorem 2.6 will be used heavily in the remainder of this paper. We introduce, for future reference, the sets

$$ \Omega_1(N) \equiv \left\{\omega \in \Omega \,\big|\, \|A_N[\omega] - \mathbb{I}\| \leq r_N(\alpha)\right\} \tag{2.25} $$

and

$$ \Omega_1 \equiv \cup_{N_0 \geq 1} \cap_{N \geq N_0} \Omega_1(N) \tag{2.26} $$

where $r_N(\alpha) \equiv 2\sqrt{\alpha} + \alpha + \epsilon$, for some arbitrarily small $\epsilon$ (one may also take $\epsilon$ that decreases with $N$, e.g. $\epsilon = C\sqrt{\ln N / N}$). Then one has that $\mathbb{P}[\Omega_1(N)] \geq 1 - K\exp(-N\epsilon^2/K)$ and $\mathbb{P}[\Omega_1] = 1$.

3. Global minima

In this section we determine a regime in the $\alpha,\beta$ plane for which global minima away from the Mattis states can be excluded. This will provide a more transparent proof and better estimates on the parameters than previously obtained in [BGP1]. In particular, it will yield the correct asymptotic behaviour of the maximal allowed $\alpha$ for $\beta \downarrow 1$, which agrees (up to constants) with the findings from replica methods [AGS].

We first introduce the following subsets of $\mathbb{R}^M$:

$$ T_\epsilon \equiv \left\{m \,\big|\, \big|\|m\|_2 - m^*\big| \geq \epsilon m^*\right\} \tag{3.1} $$

and

$$ D_{\rho,\epsilon} \equiv T_\epsilon^c \cap \left[\cup B_\rho(s m^* e^\mu)\right]^c \tag{3.2} $$

where the union runs over all $(\mu, s) \in \{1,\dots,M\} \times \{-1,1\}$ and where $B_\rho(m)$ denotes the ball of radius $\rho$ centered at $m$.

The central result of this section is the following theorem. 

Theorem 3.1: There exist strictly positive constants $\gamma_a, c_1, c_2, c$ such that for all $0 < \alpha \leq \gamma_a^2 (m^*(\beta))^4$ there exists a set $\tilde\Omega \subset \Omega$ with $\mathbb{P}[\tilde\Omega^c] \leq e^{-c_1 N}$ and a constant $0 < c_4 < \frac12$, such that for all $\omega \in \tilde\Omega$ the function $\Phi_{N,\beta}[\omega](m)$ satisfies the following:

(i) For all $m \in T_{1/35}$

$$ \Phi_{N,\beta}[\omega](m) - \phi(m^*) \geq \frac12 c\,(m^*)^2 \left(\|m\|_2 - m^*\right)^2 \tag{3.3} $$

and

(ii) For all $m \in D_{c_4 m^*, 1/35}$,

$$ \Phi_{N,\beta}[\omega](m) - \phi(m^*) \geq c_2 (m^*)^4 \tag{3.4} $$

In particular, all absolute minima of $\Phi$ lie in the union of the balls $B_{c_4 m^*}(\pm m^* e^\mu)$.

Remark: The proof of this theorem provides estimates on the constants that we have not tried to optimize. The interested reader is invited to do this. The relation between the critical $\alpha$ and $\beta - 1$ shows however the expected correct power-law behaviour. Note that asymptotically, as $\beta \downarrow 1$, $m^*(\beta)^2 \approx 3(\beta - 1)$.

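Proof: Let us first give a brief outline of the proof. We will treat separately the regions $T_\epsilon$, $D_{m^*/2,\epsilon}$ and the balls $B_{m^*/2}(s m^* e^\mu)$. On the first two sets we will use that on the set $\Omega_1$ defined in (2.26),

$$ \Phi(m) - \phi(m^*) \geq -\frac{r(\alpha)}{2}\|m\|_2^2 + \frac{1}{N}\sum_i \left(\phi((\xi_i, m)) - \phi(m^*)\right) \tag{3.5} $$

and prove a suitable lower bound on

$$ \frac{1}{N}\sum_i \left(\phi((\xi_i, m)) - \phi(m^*)\right) \tag{3.6} $$

To treat the balls $B_{m^*/2}(s m^* e^\mu)$ we will, performing the change of variables $m = s m^* e^\mu + v$, use that on $\Omega_1$

$$ \Phi(m) - \phi(m^*) \geq -\frac{r(\alpha)}{2}\|v\|_2^2 - m^* r(\alpha)\|v\|_2 + \frac{1}{N}\sum_i \left(\phi((\xi_i, m)) - \phi(m^*)\right) \tag{3.7} $$

to show that

$$ \frac{1}{N}\sum_i \left(\phi((\xi_i, m)) - \phi(m^*)\right) \geq c'(\beta)\|v\|_2^2 $$

The condition $\|v\|_2 > \frac{m^* r(\alpha)}{c'(\beta) - r(\alpha)/2}$ then guarantees $\Phi(m) > \phi(m^*)$. Of course this requires $c'(\beta) > r(\alpha)/2$.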

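We start with the following preparatory lemmas.

Lemma 3.2: Let $\phi(z) \equiv \frac{z^2}{2} - \frac{1}{\beta}\ln\cosh(\beta z)$ and

$$ c(\beta) \equiv \frac{\ln\cosh(\beta m^*)}{\beta (m^*)^2} - \frac12 $$

Then for all $\beta > 1$ and for all $z$

$$ \phi(z) - \phi(m^*) \geq c(\beta)\left(|z| - m^*\right)^2 \tag{3.10} $$

Moreover $c(\beta)$ tends to $\frac12$ as $\beta \uparrow \infty$, and behaves like $\frac{1}{12}(m^*(\beta))^2$, as $\beta \downarrow 1$.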

Proof: Notice that the function $\phi(z)$ is symmetric and has the property that $z\phi'''(z) \geq 0$. Considering only the positive branch, we see that the constant $c(\beta)$ was chosen such that equality holds in (3.10) at the points $0$ and $m^*$. To show that this implies that the quadratic function is a lower bound is an exercise in elementary calculus.

The asymptotic behaviour of $c(\beta)$ follows from the fact that for small argument, $\ln\cosh x \sim x^2/2$, while for large arguments $\ln\cosh x \sim |x|$. ◊
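
The quadratic lower bound of Lemma 3.2 can be checked numerically. The sketch below assumes the Curie-Weiss function $\phi(z) = z^2/2 - \beta^{-1}\ln\cosh(\beta z)$ and the constant $c(\beta) = \ln\cosh(\beta m^*)/(\beta (m^*)^2) - 1/2$, which is the value forced by requiring equality in (3.10) at $z = 0$; the helper names, the grid and the chosen values of $\beta$ are illustrative.

```python
import math


def m_star(beta, tol=1e-14, max_iter=200000):
    """Largest solution of m = tanh(beta * m), by fixed-point iteration from m = 1."""
    if beta <= 1.0:
        return 0.0
    m = 1.0
    for _ in range(max_iter):
        new_m = math.tanh(beta * m)
        if abs(new_m - m) < tol:
            return new_m
        m = new_m
    return m


def phi(z, beta):
    """Curie-Weiss function phi(z) = z^2/2 - ln cosh(beta z) / beta."""
    return 0.5 * z * z - math.log(math.cosh(beta * z)) / beta


def c_const(beta):
    """Constant making the bound an equality at z = 0 and z = m*."""
    ms = m_star(beta)
    return math.log(math.cosh(beta * ms)) / (beta * ms * ms) - 0.5


def gap(z, beta):
    """phi(z) - phi(m*) - c(beta) * (|z| - m*)^2, claimed to be non-negative."""
    ms = m_star(beta)
    return phi(z, beta) - phi(ms, beta) - c_const(beta) * (abs(z) - ms) ** 2


worst = min(gap(k / 100.0, b) for b in (1.2, 2.0, 5.0) for k in range(-300, 301))
```

On the grid used here the smallest gap is zero up to rounding, attained at $z = 0$ and near $z = \pm m^*$, where the bound is designed to be tight.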

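Lemma 3.3: On $\Omega_1$:

(i) If $\|m\|_2\sqrt{1 - r(\alpha)} \geq m^*$, then

$$ \frac{1}{N}\sum_{i=1}^N \phi((\xi_i, m)) - \phi(m^*) \geq c(\beta)\left(\|m\|_2\sqrt{1 - r(\alpha)} - m^*\right)^2 \tag{3.11} $$

and

(ii) if $\|m\|_2\sqrt{1 + r(\alpha)} \leq m^*$, then

$$ \frac{1}{N}\sum_{i=1}^N \phi((\xi_i, m)) - \phi(m^*) \geq c(\beta)\left(\|m\|_2\sqrt{1 + r(\alpha)} - m^*\right)^2 \tag{3.12} $$

Proof: Using Lemma 3.2, we see that

$$ \frac{1}{N}\sum_{i=1}^N \left(\phi((\xi_i, m)) - \phi(m^*)\right) \geq c(\beta)\frac{1}{N}\sum_{i=1}^N \left(|(\xi_i, m)| - m^*\right)^2 \geq c(\beta)\left(\sqrt{\frac{1}{N}\sum_{i=1}^N (\xi_i, m)^2} - m^*\right)^2 \tag{3.13} $$

where we used the Schwarz inequality. From here the lemma follows by using the bounds on the norms of the random matrices $\xi^t\xi/N$ established in Section 2. ◊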

Corollary 3.4: There exists a constant $c_1 > 0$ such that if $\sqrt{\alpha} \leq c_1 (m^*)^2$, then there exists $\epsilon = \epsilon(\alpha) \sim \sqrt{2\sqrt{\alpha}/c(\beta)}$ such that if $m \in T_\epsilon$ then for $\omega \in \Omega_1$,

$$ \Phi_{N,\beta}[\omega](m) - \phi(m^*) \geq \frac12 c(\beta)\left(\|m\|_2 - m^*\right)^2 \tag{3.14} $$

Proof: This follows from the preceding lemma by elementary algebra. ◊

This concludes our treatment of the region $T_\epsilon$. The case of the region $D_{m^*/2,\epsilon}$ and the balls $B_{m^*/2}(s m^* e^\mu)$ will be more involved. In particular, we will get a priori only probabilistic versions of the analogs of Lemma 3.3, and thus we will have to estimate probabilities of suprema over $m$ of our functions $\phi(m)$. Our first observation is thus that the function $\Phi(m)$ is Lipschitz continuous on $\Omega_1$, which will allow us to reduce the problem to an estimate of a lattice supremum. We have

Lemma 3.5: For all $\omega \in \Omega_1$ and for all $\beta$,

$$ |\Phi(m) - \Phi(m')| \leq \left[\frac{\|m\|_2 + \|m'\|_2}{2} + \sqrt{1 + r(\alpha)}\right]\|m - m'\|_2 \tag{3.15} $$

Proof: The proof of this lemma consists just in some applications of the Schwarz inequality. Note that of course

$$ \left|\|m\|_2^2 - \|m'\|_2^2\right| = \left|(m + m', m - m')\right| \leq \|m + m'\|_2 \|m - m'\|_2 \tag{3.16} $$

On the other hand it follows from the mean value theorem in $\mathbb{R}^M$ that for some $0 < \theta < 1$,

$$ \begin{aligned} \left|\frac{1}{\beta N}\sum_i \left(\ln\cosh(\beta(\xi_i, m)) - \ln\cosh(\beta(\xi_i, m'))\right)\right| &= \left|\frac{1}{N}\sum_i (\xi_i, m - m')\tanh\left(\beta(\xi_i, m' + \theta(m - m'))\right)\right| \\ &\leq \|m - m'\|_2 \sqrt{\|A\|}\ \sqrt{\frac{1}{N}\sum_i \tanh^2\left(\beta(\xi_i, m' + \theta(m - m'))\right)} \end{aligned} \tag{3.17} $$

Using the inequality $|\tanh x| \leq 1$ and the bound on the norm of $\xi^t\xi/N$ on $\Omega_1$, we arrive at (3.15). ◊

Remark: The bound (3.15) is actually quite poor and can be improved considerably, in particular for $m$ and $m'$ near the critical points of $\Phi$ and if $\beta$ is near one. We leave this as an exercise to the reader. We can live with this simple bound at the expense of choosing a smaller lattice spacing, and this does not substantially deteriorate our results.

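Lemma 3.6: Let $X_i \geq 0$, $i = 1,\dots,N$ be positive i.i.d. random variables that satisfy $\mathbb{P}[X_i \geq z] \geq q$. Then for all $\zeta > 0$,

$$ \mathbb{P}\left[\frac{1}{N}\sum_{i=1}^N X_i \leq q z (1 - \zeta)\right] \leq \exp\left\{-N q \frac{\zeta^2}{2}\right\} \tag{3.18} $$
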



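Proof: By the exponential Markov inequality we have that

$$ \begin{aligned} \mathbb{P}\left[\frac{1}{N}\sum_{i=1}^N X_i \leq q z(1-\zeta)\right] &\leq \inf_{t\geq 0} e^{t q z (1-\zeta) N}\left[\mathbb{E} e^{-t X_1}\right]^N \\ &\leq \inf_{t \geq 0} e^{t q z (1-\zeta) N}\left[1 + q\left(e^{-tz} - 1\right)\right]^N \end{aligned} \tag{3.19} $$

Choosing $t = \zeta/z$ and using the inequality

$$ \ln\left[1 + q\left(e^{-\zeta} - 1\right)\right] \leq -q\zeta + \frac{q\zeta^2}{2} \tag{3.20} $$

one obtains (3.18). ◊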
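
The two elementary steps of this proof can be verified on a grid. The sketch below assumes the inequality $\ln[1 + q(e^{-\zeta} - 1)] \leq -q\zeta + q\zeta^2/2$ and the choice $t = \zeta/z$, and checks that the resulting exponent per unit $N$ is at most $-q\zeta^2/2$. The function names and grid values are illustrative.

```python
import math


def log_mgf_bound(q, zeta):
    """ln[1 + q(exp(-zeta) - 1)]: bound on ln E exp(-t X) at t = zeta / z."""
    return math.log(1.0 + q * (math.exp(-zeta) - 1.0))


def quadratic_bound(q, zeta):
    """Elementary upper bound -q*zeta + q*zeta^2/2."""
    return -q * zeta + 0.5 * q * zeta * zeta


def exponent(q, zeta):
    """Exponent per unit N after inserting t = zeta / z: zeta*q*(1-zeta) + ln[...]."""
    return zeta * q * (1.0 - zeta) + log_mgf_bound(q, zeta)


grid = [(q / 20.0, z / 20.0) for q in range(1, 21) for z in range(1, 41)]
checks = [(exponent(q, zeta), -0.5 * q * zeta * zeta) for q, zeta in grid]
```

The first inequality combines $\ln(1+u) \leq u$ with $e^{-\zeta} \leq 1 - \zeta + \zeta^2/2$ for $\zeta \geq 0$, which is why it holds for every $q \in (0,1]$.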

Lemma 3.6 will be used together with the following observation. 

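Lemma 3.7: Let $1 \leq t < M$ be a fixed integer. For any $m \in \mathbb{R}^M$ set

$$ \hat m \equiv (m_1,\dots,m_t,0,\dots,0), \qquad \check m \equiv (0,\dots,0,m_{t+1},\dots,m_M) \tag{3.21} $$

Then, for any $0 < d < 1$,

$$ \mathbb{P}\left[\phi((\xi_i, m)) - \phi(m^*) \geq c(\beta)(m^*)^2 d^2\right] \geq \frac12 - \frac{\|m\|_2^2 + \left(\|\hat m\|_2 - \|\check m\|_2\right)^2}{4(1-d)^2 (m^*)^2} \tag{3.22} $$

where $c(\beta)$ is the constant from Lemma 3.2.
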

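Proof: Let us put $X \equiv (\hat m, \xi_i)$ and $Y \equiv (\check m, \xi_i)$, with $\hat m$ and $\check m$ the two parts of $m$ from (3.21). Note that $X$ and $Y$ are independent and symmetric random variables. By Lemma 3.2 we have that

$$ \mathbb{P}\left[\phi((\xi_i, m)) - \phi(m^*) \geq c(\beta)(m^*)^2 d^2\right] \geq 1 - \mathbb{P}\left[\big||X + Y| - m^*\big| \leq d m^*\right] \tag{3.23} $$

Now

$$ \mathbb{P}\left[\big||X + Y| - m^*\big| \leq d m^*\right] \leq \mathbb{P}\left[|X + Y| \geq (1-d) m^*\right] \tag{3.24} $$

By the symmetry of $X$ and $Y$,

$$ \begin{aligned} \mathbb{P}\left[|X+Y| \geq (1-d)m^*\right] &= \frac12\mathbb{P}\left[|X+Y| \geq (1-d)m^* \,\big|\, X > 0, Y > 0\right] + \frac12\mathbb{P}\left[|X+Y| \geq (1-d)m^* \,\big|\, X > 0, Y < 0\right] \\ &\leq \frac12 + \frac12\mathbb{P}\left[|X+Y| \geq (1-d)m^* \,\big|\, X > 0, Y < 0\right] \end{aligned} \tag{3.25} $$

For the last term we finally use the Chebyshev inequality. This gives

$$ \mathbb{P}\left[|X+Y| \geq (1-d)m^* \,\big|\, X > 0, Y < 0\right] \leq \frac{\mathbb{E}X^2 + \mathbb{E}Y^2 - 2\,\mathbb{E}|X|\,\mathbb{E}|Y|}{(1-d)^2 (m^*)^2} \tag{3.26} $$

The announced result follows from here by the Khintchine inequality [Sz], which tells us that $\mathbb{E}|X| \geq \|\hat m\|_2/\sqrt{2}$. ◊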
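
For short vectors the Khintchine lower bound used here can be verified exactly by enumerating all sign patterns. The sketch below does this for a few arbitrary test vectors; the function names and the vectors are illustrative only.

```python
import itertools
import math


def mean_abs(m):
    """Exact E|sum_mu eps_mu m_mu| over independent symmetric signs eps_mu = +-1."""
    total = 0.0
    for signs in itertools.product((-1, 1), repeat=len(m)):
        total += abs(sum(s * c for s, c in zip(signs, m)))
    return total / 2 ** len(m)


def l2_norm(m):
    """Euclidean norm of the coefficient vector."""
    return math.sqrt(sum(c * c for c in m))


vectors = [[1.0, 1.0], [3.0, 4.0], [1.0, 1.0, 1.0], [0.5, -2.0, 1.0, 0.25]]
ratios = [mean_abs(v) / l2_norm(v) for v in vectors]
```

The vector $(1,1)$ attains the constant $1/\sqrt{2}$ exactly, which shows that this constant cannot be improved.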

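To make use of this lemma, one has to choose $t$ in the decomposition (3.21) in such a way that the norms $\|\hat m\|_2$ and $\|\check m\|_2$ of the two parts of $m$ are as similar as possible. We may suppose without loss of generality that $m_1 \geq |m_2| \geq \dots \geq |m_M|$. Then the conditions $\big|\|m\|_2 - m^*\big| \leq \epsilon m^*$ and $\|m - e^1 m^*\|_2 \geq \frac12 m^*$ imply that

$$ m_1 \leq \left(\frac78 + \epsilon + \frac{\epsilon^2}{2}\right) m^* \tag{3.27} $$

Without losing anything, we can choose $t = 1$ as long as

$$ m_1 \geq m^*\sqrt{(1-\epsilon)^2 - \left(\frac78 + \epsilon + \frac{\epsilon^2}{2}\right)^2} \tag{3.28} $$

This gives the bound

$$ \left(\|\hat m\|_2 - \|\check m\|_2\right)^2 \leq (m^*)^2\left[\frac78 + \epsilon + \frac{\epsilon^2}{2} - \sqrt{(1-\epsilon)^2 - \left(\frac78 + \epsilon + \frac{\epsilon^2}{2}\right)^2}\,\right]^2 \equiv g(\epsilon)(m^*)^2 \tag{3.29} $$

If $m_1$ is smaller than the value given in (3.28), then we must choose $t$ larger. The point here is that we can always find a $t$ for which

$$ \big|\|\hat m\|_2 - \|\check m\|_2\big| \leq m_1 \tag{3.30} $$

and this implies, for these values of $m_1$, an even smaller bound on $\left(\|\hat m\|_2 - \|\check m\|_2\right)^2$ than $g(\epsilon)(m^*)^2$. Combining now (3.22) and (3.29) with Lemma 3.6, we arrive at the bound

$$ \begin{aligned} &\mathbb{P}\left[\frac{1}{N}\sum_i \phi((\xi_i, m)) - \phi(m^*) \leq c(\beta)(m^*)^2 d^2\left[\frac12 - \frac{(1+\epsilon)^2 + g(\epsilon)}{4(1-d)^2}\right](1-\zeta)\right] \\ &\qquad \leq \exp\left\{-N\left[\frac12 - \frac{(1+\epsilon)^2 + g(\epsilon)}{4(1-d)^2}\right]\frac{\zeta^2}{2}\right\} \end{aligned} \tag{3.31} $$

for arbitrary positive $\zeta$. This bound looks somewhat complicated, and it is most reasonable to make a choice for $\epsilon$ and $d$. Numerically, it turns out that if we fix $\epsilon = \frac{1}{35}$ and $d \approx 0.102$, then (3.31) gives us the desired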

Lemma 3.8: For all m £ D m * /2,i/35 and for any ( > 



IP 



1 ,,,, \ \ *n c(/3)(m*) 2 (l - () 



35 2 



< exp 



c 2 

-JV — 
32 



(3.32) 



We are left to treat the case of the balls $B_{1/2}(s m^* e^\mu)$. W.l.o.g. we will consider the ball
$B_{1/2}(m^* e^1)$. We will prove:



Lemma 3.9: Assume that $m \in B_{1/2}(m^* e^1)$. Then,



IP 



jj E m )) - <^ m *) ^ t 1 - r ( a ) - °- 5 ] H m - elm *ii2 l n i ^ ex p (-§) 



(3.33) 



^ere c(/3) = ^^ft" (* ^ A {m*f , for (3 near I). 
Proof: As in Lemma 3.2, it is clear that for $z \ge -\tfrac34 m^*$,

$$\phi(z) - \phi(m^*) \ge c(\beta)(z - m^*)^2 \qquad (3.34)$$

if $c(\beta)$ is chosen such that the parabola on the right intersects the function on the left at
$z = -3m^*/4$. Thus we can use that



" E ^S.H^^X- 3 ™*^} m ) " m *) 2 

But since on $\Omega_1$, $\|A - \mathbb{I}\| \le r(\alpha)$,

$$\frac1N\sum_i\big((\xi_i,m) - \xi_i^1 m^*\big)^2 = \left((m - m^*e^1),\ \frac{\xi^T\xi}{N}\,(m - m^*e^1)\right)$$



(3.35) 



> m — me 



N 



(3.36) 



> \\m- m*e 1 \\ 2 2 (l - r(a)) 



So that all we have to estimate is 



IP 



(3.37) 
where we set m) = m* + v + (m, m = m - m 1 e 1 and £^ = . 

We now use the exponential Markov inequality and estimate the Laplace transform 



iEexp 



{(fi.AX-Im'-.} + 



(3.38) 



A rather straightforward computation with the choice t = 2 \\^ l \\ 2 gi yes 



IEexp 
< 1 + 



2||m||2 {(£i»<-4 m 



exp 



2||m|| 



(l m * + v) 2 - g dm*)' 



(3.39) 



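Since under our assumptions $\|\tilde m\|_2^2 + v^2 \le \tfrac14 (m^*)^2$ and $\|\tilde m\|_2^2 + (m^*+v)^2 \ge (1 - \tfrac{1}{35})^2 (m^*)^2$, we have
the bounds $v \ge -\left[\tfrac18 + \tfrac12\left[(1-\tfrac1{35})^2 - 1\right]\right] m^* \approx -0.096836\, m^*$ and $\|\tilde m\|_2^2 \le (m^*)^2/4$, which gives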
with g=\, 



4|| m ||2 {(£i,m)<- T , 



iEexp „ I r 

Using that $\|m - e^1 m^*\|_2 \ge \|\tilde m\|_2$, we get from here
IP 



< 1.10778 



(3.40) 



N X^mmX-fm*^-"} Uii>™') + V )) > y\\m-e 1 m*\\\ 



y\\m — e 1 ?™*!! 2 , 
< exp ( -N^ . 112 + JV0.1024 

4|| m ||2 

- - 0.1024 

4 



(3.41) 



< exp [-N 

Choosing $y = 0.5$ then gives the assertion of Lemma 3.9.
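The numerical constants quoted in this proof can be rechecked with a few lines of Python. This is only a sanity check under stated assumptions: we read the exponent in (3.41) as $y/4 - 0.1024$, and we take the constant 0.1024 to be an upper bound for $\ln 1.10778$ from (3.40). The variable names are ours and do not appear in the text.

```python
import math

eps = 1.0 / 35.0

# Lower bound on v / m* exactly as quoted in the text.
v_bound = -(1.0 / 8.0 + 0.5 * ((1.0 - eps) ** 2 - 1.0))

# ln(1.10778) should not exceed the constant 0.1024 (assumed reading of (3.41)).
log_laplace = math.log(1.10778)

# Exponential rate for y = 0.5, read as y/4 - 0.1024.
y = 0.5
rate = y / 4.0 - 0.1024

print(f"v bound / m*      = {v_bound:.6f}")
print(f"ln(1.10778)       = {log_laplace:.6f}")
print(f"rate y/4 - 0.1024 = {rate:.4f}")
```

Under this reading the rate is strictly positive for $y = 0.5$, which is what the exponential bound requires.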

We can now conclude the proof of Theorem 3.1. Note that statement (i) follows immediately 
from Corollary 3.4, if c\ is sufficiently small to allow us to set e(a) = and if c satisfies c(/3) > 
c(m*) 2 . 

Combining the estimates of Lemma 3.8 and 3.9, and choosing a constant $c_4 < 1/2$, we get that
if only $c_1$ (and thus $r(\alpha)$) is sufficiently small,

$$\mathbb{P}\left[\Phi(m) - \phi(m^*) > c_2 (m^*)^4\right] \le e^{-c_3 N} \qquad (3.42)$$

for all $m \in D_{c_4 m^*,\,1/35}$ and for some strictly positive constants $c_2$ and $c_3$. It remains only to extend
this to an estimate of the supremum over all $m \in D_{c_4 m^*,\,1/35}$. Let us choose $k > 2$. Then as in
Section 2, we get immediately that



IP 



sup $(m) - 4>{m*) > c 2 (m*) 4 

m e £> c 4 m * , 1 / 3 5 n w M i ^ it 



< e JV[a(fc|lna| + l)-c 3 ] 



(3.43) 



But by Lemma 3.5 the supremum over $D_{c_4 m^*,\,1/35}$ differs from the lattice supremum by no more
than $2\alpha^k$, so that the claim (ii) of the theorem follows by slightly adjusting the constants $c_2$ and
$c_4$. ◊◊
4. Local minima of $\Phi$ near the 'Mattis states'



We will now show that the large deviation function $\Phi(m)$ actually has a quadratic behaviour
in the neighborhood of the minima that correspond to the stored patterns. We already know that
for very small $\alpha$, the absolute minima of $\Phi$ are located in the vicinity of these points. Here we will
compute the location of the minima more precisely, and we show that they exist for much larger
values of $\alpha$ than those for which our proof in the previous section worked. The proofs in this section
use some of the methods introduced in Section 2.



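4.1. Upper and lower bounds on $\Phi$

Let us for convenience consider the minimum at $m^{(1,1)}$. We set

$$m = e^1 m^* + v \qquad (4.1)$$

We recall the notation $A \equiv \xi^T\xi/N$ and $B \equiv A - \mathbb{I}$. We may write

$$\begin{aligned}
\Phi(m) &= -\tfrac12 (m, Bm) + \frac1N\sum_i \phi((\xi_i,m)) \\
&= -\tfrac12 (v,Bv) - (v, B m^* e^1) + \frac1N \sum_i \phi\big(m^*\xi_i^1 + (\xi_i,v)\big) \\
&= -\tfrac12(v,Bv) - m^*(v,z^{(1)}) + \frac1N\sum_i \phi\big(m^* + \xi_i^1(\xi_i,v)\big)
\end{aligned} \qquad (4.2)$$

where $z^{(1)}_1 = 0$ and, for $\mu \ne 1$, $z^{(1)}_\mu = \frac1N\sum_i \xi_i^1\xi_i^\mu$. Here $m^* = m^*(\beta)$ is by assumption one of the
values at which $\phi(x)$ attains its minimum. We have the following result on the function $\phi$.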

Lemma 4.1: Assume that $|z| \le r m^*$. Then, for all $\beta > 1$, there exists a constant $0 < c(\beta,r) < 1$
such that

$$\phi(m^*+z) - \phi(m^*) \le \frac{z^2}{2}\left[1 - \beta(1-(m^*)^2)\right](1 + c(\beta,r)) \qquad (4.3)$$

and

$$\phi(m^*+z) - \phi(m^*) \ge \frac{z^2}{2}\left[1 - \beta(1-(m^*)^2)\right](1 - c(\beta,r)) \qquad (4.4)$$

$c(\beta,r)$ satisfies $\lim_{\beta\downarrow 1} c(\beta,r) \le \frac{r(1+r)}{2}$ and $\lim_{\beta\uparrow\infty} c(\beta,r) = 0$.
Moreover, for all values of $z$,

$$\phi(m^*+z) - \phi(m^*) \ge 0 \qquad (4.5)$$

and

$$\phi(m^*+z) - \phi(m^*) \le \frac{z^2}{2}\left[1 - \beta + \beta\tanh^2(\beta(m^*+|z|))\right] \qquad (4.6)$$
Proof: Taylor's formula with remainder gives that

$$\phi(m^*+z) - \phi(m^*) - \frac{z^2}{2}\phi''(m^*) = \frac{z^3}{6}\phi'''(\tilde z) \qquad (4.7)$$

for some $\tilde z \in [m^*, m^*+z]$. Now

$$\phi''(\tilde z) = 1 - \beta(1 - \tanh^2(\beta\tilde z)) \qquad (4.8)$$

and

$$\phi'''(\tilde z) = 2\beta^2\,\frac{\tanh(\beta\tilde z)}{\cosh^2(\beta\tilde z)} \qquad (4.9)$$

Since $m^* = \tanh(\beta m^*)$ by definition, we get



2 z fl 2 tanh (ff£) 



(4.10) 



For $\beta$ close to 1, a good estimate is

$$\left|\frac{2z\beta^2\tanh(\beta\tilde z)}{6\left[1-\beta(1-(m^*)^2)\right]\cosh^2(\beta\tilde z)}\right| \le \frac13\,\frac{r(1+r)\beta^3(m^*)^2}{1-\beta+\beta(m^*)^2} \qquad (4.11)$$

Since a simple calculation shows that to first order in $\beta - 1$, $(m^*(\beta))^2 = 3(\beta - 1)$, this gives the
desired estimate in the case $\beta \downarrow 1$. For $\beta$ large, note first that $m^*(\beta) \uparrow 1$ exponentially fast, and
so $1 - \beta(1 - (m^*)^2) \sim 1 - \frac{\beta}{\cosh^2\beta}$ tends to 1 exponentially fast. From this it is plain to see that
in that case the right hand side of (4.11) is of the order of $\beta^2/\cosh^2(\beta m^*(1 - r))$, which tends to
zero exponentially fast as $\beta \uparrow \infty$.



(4.5) is trivial and (4.6) follows from Taylor's theorem with second order remainder and (4.8).
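The two global bounds (4.5) and (4.6), and the asymptotics $(m^*)^2 \approx 3(\beta-1)$ used in the proof, are easy to probe numerically. The sketch below assumes $\phi(x) = x^2/2 - \beta^{-1}\ln\cosh(\beta x)$, which is the function whose second derivative is (4.8); the helper names `phi`, `m_star` and `bounds_hold` are ours.

```python
import math

def phi(x, beta):
    # phi(x) = x^2/2 - ln cosh(beta x)/beta; its second derivative is (4.8)
    return 0.5 * x * x - math.log(math.cosh(beta * x)) / beta

def m_star(beta):
    # positive root of m = tanh(beta m) for beta > 1, found by bisection
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.tanh(beta * mid) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def bounds_hold(beta, zmax_over_mstar=2.0, n=400):
    # check (4.5) and (4.6) on a grid of z in [-2 m*, 2 m*]
    ms = m_star(beta)
    for k in range(1, n + 1):
        for sign in (-1.0, 1.0):
            z = sign * zmax_over_mstar * ms * k / n
            diff = phi(ms + z, beta) - phi(ms, beta)
            cap = 0.5 * z * z * (1.0 - beta + beta * math.tanh(beta * (ms + abs(z))) ** 2)
            if diff < -1e-12 or diff > cap + 1e-12:
                return False
    return True

checks = {beta: bounds_hold(beta) for beta in (1.05, 1.5, 3.0)}
ratio = m_star(1.01) ** 2 / (3.0 * 0.01)   # should be close to 1 near beta = 1
print(checks, round(ratio, 3))
```

The grid deliberately extends beyond $|z| \le r m^*$, since (4.5) and (4.6) are claimed for all $z$; the sharper bounds (4.3) and (4.4) are not tested here.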







We would like to use the bounds from Lemma 4.1 in (4.2), and preferably the sharper bounds
(4.3) and (4.4). The problem here is that even under smallness conditions on $v$ we cannot be sure
that for all $i$ the quantities $(\xi_i, v)$ will have modulus smaller than $a = r m^*$. We will first show how
to deal with this for the lower bound. The proof of the upper bound will be similar but slightly
more involved.
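We get from Lemma 4.1 for $\Phi(m)$ the lower bound

$$\begin{aligned}
\Phi(m) - \phi(m^*) &\ge -\tfrac12(v,Bv) - m^*(v,z^{(1)}) + \tfrac12\left[1-\beta(1-(m^*)^2)\right](1-c(\beta,r))\,\frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|\le a\}}(\xi_i,v)^2 \\
&= -\tfrac12(v,Bv) - m^*(v,z^{(1)}) + \tfrac12\left[1-\beta(1-(m^*)^2)\right](1-c(\beta,r))\,\frac1N\sum_{i=1}^N (\xi_i,v)^2 \\
&\qquad - \tfrac12\left[1-\beta(1-(m^*)^2)\right](1-c(\beta,r))\,\frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|> a\}}(\xi_i,v)^2 \\
&= \tfrac12\left(v,\left[\mathbb{I} - (1-c_-(\beta,r))A\right]v\right) - m^*(v,z^{(1)}) - \tfrac12\,c_-(\beta,r)\,\frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|> a\}}(\xi_i,v)^2
\end{aligned} \qquad (4.12)$$

where we have set $\left[1 - \beta(1-(m^*)^2)\right](1 \pm c(\beta,r)) \equiv c_\pm(\beta,r)$. The last term in this bound is the only
difficult one to treat. We set

$$X_a(v) \equiv \frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|>a\}}(\xi_i,v)^2 \qquad (4.13)$$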

Our problem will be to estimate the supremum of this quantity over all $v$ in some ball. This problem
is reminiscent of what we did in Section 2 when we estimated norms of the matrices $A$, and we
will solve it in a very similar way. As we will see in the process of our analysis, we will also have
to consider simultaneously the related variables

$$Y_a(v) \equiv \frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|>a\}} \qquad (4.14)$$

As a starting point, we need estimates on the size of these random variables for fixed $v$. They are
given by the following lemma.

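Lemma 4.2: Let $\{\xi_i^\mu\}$ be independent centered Bernoulli random variables. Define $\rho_a(x) \equiv 2\exp\left(-\frac{a^2}{2x^2}\right)$. Then

$$\mathbb{P}\left[X_a(v) > x\right] \le \exp\left(N\left[2\sqrt{\rho_a(\|v\|_2)} - \frac{x}{4\|v\|_2^2}\right]\right) \qquad (4.15)$$

and for $x > \rho_a(\|v\|_2)$

$$\mathbb{P}\left[Y_a(v) > x\right] \le \exp\left(-N\,\frac{(x - \rho_a(\|v\|_2))^2}{3\rho_a(\|v\|_2)}\right) \qquad (4.16)$$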
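Proof: We begin with the proof of (4.15). By the exponential Markov inequality we have that for
any positive $t$,

$$\mathbb{P}\left[\sum_{i=1}^N(\xi_i,v)^2\,\mathbf{1}_{\{|(\xi_i,v)|>a\}} > xN\right] \le e^{-txN}\left[\mathbb{E}\,e^{t(\xi_1,v)^2\mathbf{1}_{\{|(\xi_1,v)|>a\}}}\right]^N \qquad (4.17)$$

To estimate the Laplace transform, we write

$$\mathbb{E}\,e^{t(\xi_1,v)^2\mathbf{1}_{\{|(\xi_1,v)|>a\}}} \le 1 + \mathbb{E}\,e^{t(\xi_1,v)^2}\mathbf{1}_{\{|(\xi_1,v)|>a\}} \le \exp\left(\mathbb{E}\,e^{t(\xi_1,v)^2}\mathbf{1}_{\{|(\xi_1,v)|>a\}}\right) \qquad (4.18)$$

For $t < 1/(2\|v\|_2^2)$ we have that

$$\begin{aligned}
\mathbb{E}\,e^{t(\xi_1,v)^2}\mathbf{1}_{\{|(\xi_1,v)|>a\}} &\le 2\,\mathbb{E}\,e^{t(\xi_1,v)^2}\mathbf{1}_{\{(\xi_1,v)>a\}} \\
&\le 2\inf_{s>0} e^{-sa}\,\mathbb{E}\,e^{t(\xi_1,v)^2 + s(\xi_1,v)} \\
&= 2\inf_{s>0} e^{-sa}\,\mathbb{E}\int\frac{dz}{\sqrt{2\pi}}\exp\left(-\frac{z^2}{2} + \left[\sqrt{2t}\,z + s\right](\xi_1,v)\right) \\
&\le 2\inf_{s>0} \frac{1}{\sqrt{1-2t\|v\|_2^2}}\exp\left(-sa + \frac{s^2\|v\|_2^2}{2(1-2t\|v\|_2^2)}\right) \\
&= \frac{2\exp\left(-\frac{a^2}{2\|v\|_2^2}\,(1-2t\|v\|_2^2)\right)}{\sqrt{1-2t\|v\|_2^2}}
\end{aligned} \qquad (4.19)$$

Setting $t = \frac{1}{4\|v\|_2^2}$ we obtain (4.15).

To prove (4.16) we use again the exponential Markov inequality to get that for any positive $t$

$$\mathbb{P}\left[Y_a(v) > x\right] \le e^{-txN}\left[\mathbb{E}\,e^{t\mathbf{1}_{\{|(\xi_1,v)|>a\}}}\right]^N = e^{-txN}\left[(e^t - 1)\,\mathbb{P}\left(|(\xi_1,v)|>a\right) + 1\right]^N \qquad (4.20)$$

Now

$$\mathbb{P}\left((\xi_1,v) > a\right) \le e^{-\frac{a^2}{2\|v\|_2^2}} \qquad (4.21)$$

Thus, since $(\xi_1,v)$ is a symmetric r.v. we get for $x > \rho_a(\|v\|_2)$

$$\mathbb{P}\left[Y_a(v) > x\right] \le \inf_{t>0}\exp\left\{-N\left[tx - \ln\left((e^t - 1)\rho_a(\|v\|_2) + 1\right)\right]\right\} = \exp\left(-N\,I_{\rho_a(\|v\|_2)}(x)\right) \qquad (4.22)$$

where $I_p(x)$, for $p \in (0,1)$, is the well-known entropy function

$$I_p(x) = \begin{cases} x\ln\frac{x}{p} + (1-x)\ln\frac{1-x}{1-p}, & \text{if } x \in [0,1] \\ \infty, & \text{if } x > 1 \end{cases} \qquad (4.23)$$

Finally, we use that (see [BG1])

$$I_p(x) \ge \frac{(x-p)^2}{3p}$$

to arrive at (4.16). ◊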
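The quadratic lower bound on the entropy function can be probed numerically. The sketch below only tests the range $p \le x \le \min(2p, 1)$, where the estimate coincides with the standard multiplicative Chernoff bound; it makes no claim about larger $x$, and the function names are ours.

```python
import math

def entropy(p, x):
    # I_p(x) = x ln(x/p) + (1-x) ln((1-x)/(1-p)) on [0, 1], +inf for x > 1
    if x > 1.0:
        return math.inf
    total = 0.0
    if x > 0.0:
        total += x * math.log(x / p)
    if x < 1.0:
        total += (1.0 - x) * math.log((1.0 - x) / (1.0 - p))
    return total

def quadratic_bound_holds(p, steps=200):
    # check I_p(x) >= (x-p)^2/(3p) for x = (1+delta) p with 0 <= delta <= 1
    for k in range(steps + 1):
        x = p * (1.0 + k / steps)
        if x > 1.0:
            break
        if entropy(p, x) < (x - p) ** 2 / (3.0 * p) - 1e-15:
            return False
    return True

results = {p: quadratic_bound_holds(p) for p in (1e-4, 1e-2, 0.1, 0.3, 0.5)}
print(results)
```

In the application $p = \rho_a(\|v\|_2)$ is small and $x$ is only slightly larger than $p$, so the tested range covers the regime used below.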

Just as in Section 2 we can trivially extract bounds on lattice suprema. We get
Lemma 4.3: Under the hypothesis of Lemma 4.2,

$$\mathbb{P}\left[\sup_{v\in W_{M,r}(\rho)} X_a(v) > 4\rho^2\left[2\sqrt{\rho_a(\rho)} + \alpha(\ln(\rho/r) + c)\right]\right] \le e^{-N(c-1)\alpha} \qquad (4.25)$$

and

$$\mathbb{P}\left[\sup_{v\in W_{M,r}(\rho)} Y_a(v) > \rho_a(\rho) + \sqrt{3\rho_a(\rho)\,\alpha(\ln(\rho/r) + c)}\right] \le \exp\{-N(c-1)\alpha\} \qquad (4.26)$$

Proof: From Lemma 2.1 we have that

$$\mathbb{P}\left[\sup_{v\in W_{M,r}(\rho)} X_a(v) > x\right] \le e^{\alpha N(\ln(\rho/r) + 1)}\sup_{v:\,\|v\|_2\le\rho}\mathbb{P}\left[\sum_{i=1}^N(\xi_i,v)^2\,\mathbf{1}_{\{|(\xi_i,v)|>a\}} > xN\right] \qquad (4.27)$$

We use Lemma 4.2 and choose $x$ sufficiently large that the resulting probability offsets the exponential
prefactor. For this we set

$$x = 4\rho^2\left[2\sqrt{\rho_a(\rho)} + \alpha(\ln(\rho/r) + c)\right] \qquad (4.28)$$

This gives (4.25) immediately. (4.26) follows in the same way. ◊
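The cancellation behind the choice (4.28) can be spelled out in one line. This is a sketch that assumes the fixed-$v$ bound (4.15) in the form $\exp(N[2\sqrt{\rho_a(\|v\|_2)} - x/(4\|v\|_2^2)])$ and uses that this expression is largest at $\|v\|_2 = \rho$ on the ball $\|v\|_2 \le \rho$.

```latex
\begin{aligned}
e^{\alpha N(\ln(\rho/r)+1)}\,
\exp\!\Big(N\Big[2\sqrt{\rho_a(\rho)}-\frac{x}{4\rho^2}\Big]\Big)
&= \exp\!\Big(N\Big[\alpha\ln(\rho/r)+\alpha+2\sqrt{\rho_a(\rho)}
   -2\sqrt{\rho_a(\rho)}-\alpha\ln(\rho/r)-\alpha c\Big]\Big)\\
&= e^{-N(c-1)\alpha},
\qquad\text{with } x = 4\rho^2\Big[2\sqrt{\rho_a(\rho)}+\alpha(\ln(\rho/r)+c)\Big].
\end{aligned}
```

The same bookkeeping with (4.16) and $x = \rho_a(\rho) + \sqrt{3\rho_a(\rho)\alpha(\ln(\rho/r)+c)}$ gives the bound for $Y_a$.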



Now let $D \subset \mathbb{R}^M$ be any bounded domain. Our aim is to get estimates on quantities like
$\sup_{v\in D} Y_a(v)$. As in Section 2 we note that $v \in D$ can be represented in the form

$$v = \sum_{n=1}^\infty v_n \qquad (4.29)$$

with $v_n \in W_{M,r_n}(r_{n-1}) \equiv W(n)$ for $n > 1$ and $v_1 \in D \cap W_{M,r_1}$.

The following observation is crucial:
Lemma 4.4: Let $a_1 = a - d_1$, $d_1 \ll a$ be positive real constants. Then

$$X_a(v_1 + e) \le X_{a_1}(v_1) + 2\sqrt{X_{a_1}(v_1)}\sqrt{(e,Ae)} + 2a_1^2\,Y_{d_1}(e) + 3(e,Ae) \qquad (4.30)$$

and

$$Y_a(v_1 + e) \le Y_{a_1}(v_1) + Y_{d_1}(e) \qquad (4.31)$$

Proof: The proof is based on the trivial observation that

$$\begin{aligned}
\{|(\xi_i,(v_1+e))| > a\} &= \left[\{|(\xi_i,v_1)| \le a_1\}\cap\{|(\xi_i,(v_1+e))| > a\}\right] \cup \left[\{|(\xi_i,v_1)| > a_1\}\cap\{|(\xi_i,(v_1+e))| > a\}\right] \\
&\subset \left[\{|(\xi_i,v_1)| \le a_1\}\cap\{|(\xi_i,e)| > d_1\}\right] \cup \{|(\xi_i,v_1)| > a_1\}
\end{aligned} \qquad (4.32)$$

This gives,

$$\begin{aligned}
\mathbf{1}_{\{|(\xi_i,(v_1+e))|>a\}}(\xi_i,(v_1+e))^2 &\le \mathbf{1}_{\{|(\xi_i,v_1)|>a_1\}}\left((\xi_i,v_1) + (\xi_i,e)\right)^2 + \mathbf{1}_{\{|(\xi_i,v_1)|\le a_1\}}\mathbf{1}_{\{|(\xi_i,e)|>d_1\}}\left((\xi_i,v_1) + (\xi_i,e)\right)^2 \\
&\le \mathbf{1}_{\{|(\xi_i,v_1)|>a_1\}}(\xi_i,v_1)^2 + (\xi_i,e)^2 + 2\,\mathbf{1}_{\{|(\xi_i,v_1)|>a_1\}}(\xi_i,v_1)(\xi_i,e) + 2(\xi_i,e)^2 + 2a_1^2\,\mathbf{1}_{\{|(\xi_i,e)|>d_1\}}
\end{aligned} \qquad (4.33)$$

where some of the indicator functions have been dropped carelessly, and the inequality $(a + b)^2 \le
2a^2 + 2b^2$ was used in the term that we anticipate as being small. Performing the summation over
$i$ and using the Schwarz inequality in the cross term we arrive at (4.30). (4.31) is simpler and
follows in the same way. ◊
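Since (4.30) and (4.31) are deterministic inequalities, valid for every realization of the patterns, they can be tested directly on random samples. The sketch below assumes the normalizations $X_a(v) = \frac1N\sum_i \mathbf{1}_{\{|(\xi_i,v)|>a\}}(\xi_i,v)^2$, $Y_a(v) = \frac1N\sum_i\mathbf{1}_{\{|(\xi_i,v)|>a\}}$ and $A = \xi^T\xi/N$; the sizes and the scales of $v_1$ and $e$ are arbitrary illustrative choices.

```python
import math
import random

def dot(u, w):
    return sum(a * b for a, b in zip(u, w))

def X(xi, v, a):
    # X_a(v) = (1/N) sum_i 1{|(xi_i, v)| > a} (xi_i, v)^2
    return sum(s * s for s in (dot(row, v) for row in xi) if abs(s) > a) / len(xi)

def Y(xi, v, a):
    # Y_a(v) = (1/N) #{i : |(xi_i, v)| > a}
    return sum(1 for row in xi if abs(dot(row, v)) > a) / len(xi)

def quad_A(xi, e):
    # (e, A e) with A = xi^T xi / N
    return sum(dot(row, e) ** 2 for row in xi) / len(xi)

def check_lemma_4_4(N=400, M=8, a=0.6, d1=0.1, trials=50, seed=0):
    rng = random.Random(seed)
    a1 = a - d1
    for _ in range(trials):
        xi = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(N)]
        v1 = [rng.gauss(0.0, 0.3) for _ in range(M)]
        e = [rng.gauss(0.0, 0.05) for _ in range(M)]
        v = [p + q for p, q in zip(v1, e)]
        x1, eAe = X(xi, v1, a1), quad_A(xi, e)
        rhs_x = x1 + 2 * math.sqrt(x1) * math.sqrt(eAe) + 2 * a1 * a1 * Y(xi, e, d1) + 3 * eAe
        rhs_y = Y(xi, v1, a1) + Y(xi, e, d1)
        if X(xi, v, a) > rhs_x + 1e-12 or Y(xi, v, a) > rhs_y + 1e-12:
            return False
    return True

lemma_ok = check_lemma_4_4()
print(lemma_ok)
```

The check uses the constant $2a_1^2$ in front of $Y_{d_1}(e)$, as in (4.33); replacing it by the larger $2a^2$ only weakens the bound.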

Corollary 4.5: Assume that $D \subset \mathbb{R}^M$ is sufficiently regular s.t. $D \subset \cup_{x\in W_{M,r_1}\cap D} B_{r_1}(x)$,
where $B_r(x)$ is the ball of radius $r$ centered at $x$, and set $B_r \equiv B_r(0)$. Then

$$\sup_{v\in D} X_a(v) \le \left(\sqrt{\sup_{v_1\in W_{M,r_1}\cap D} X_{a_1}(v_1)} + r_1\sqrt{\|A\|}\right)^2 + 2a_1^2\sup_{e\in B_{r_1}} Y_{d_1}(e) + 2r_1^2\|A\| \qquad (4.34)$$

and

$$\sup_{v\in D} Y_a(v) \le \sup_{v_1\in W_{M,r_1}\cap D} Y_{a_1}(v_1) + \sup_{e\in B_{r_1}} Y_{d_1}(e) \qquad (4.35)$$

Proof: This is an immediate consequence of the previous considerations and the fact that
$\sup_{e\in B_{r_1}}(e,Ae) = r_1^2\|A\|$ by the definition of the norm. ◊

Clearly, the representation of the supremum can serve as a starting point for an iteration. The
norm of the matrix $A$ has been estimated in Section 2 and we know that it is close to one (for small
$\alpha$) with probability exponentially close to one. The supremum over $W_{M,r_1}$ is a lattice supremum
and has already been estimated. The remaining term is a supremum over a much smaller domain
than before, and by repeated application of (4.35) will be shown to be very small. We formulate this
in the next lemma.
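Lemma 4.6:

$$\mathbb{P}\left[\sup_{v\in B_\rho} X_a(v) > \left(2\rho\sqrt{2\sqrt{\rho_{a_1}(\rho)} + \alpha\left(\left|\ln\tfrac{\rho}{r_1}\right| + c\right)} + r_1\sqrt{\|A\|}\right)^2 + 2r_1^2\|A\| + 2a_1^2\,y\right] \le e^{-N(c-1)\alpha} + \mathbb{P}\left[\sup_{e\in B_{r_1}} Y_{d_1}(e) > y\right] \qquad (4.36)$$

Proof: This is an immediate consequence of Corollary 4.5 and Lemma 4.4. ◊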

For the last term in the bound (4.36) we get the following
Lemma 4.7: Suppose that $r_1 < d_1$. Then



IP 



sup y dl (e) > Vv=)' U L-m 2 ^ 2 + y 3 a(|lna|+C)) 



< 2exp{-iV(C - l)a} 



3 



(4.37) 



Proof: By the same type of considerations as above, using in particular (4.35), we get that

$$\sup_{e\in B_{r_1}} Y_{d_1}(e) \le \sum_{k=2}^\infty\ \sup_{e_k\in W(k)} Y_{b_k}(e_k) \qquad (4.38)$$

where $\{b_k\}$ is some decreasing sequence of positive numbers that satisfies $\sum_{k=2}^\infty b_k = d_1$. Note that
$|W(k)| \le e^{\alpha N\ln(r_{k-1}/r_k)}$ and so, using Lemma 4.2, we have



IP 



sup Y bk (e k ) > _p bfc (r fe _i) + W3p bfc (r fe _i)a ( | In ^-^-| + ( fe 
e k CW{k) V V ?"fc 



< e -^(^-i) (4.39) 



At this point one can make some reasonable choice for the parameters r^, b k and We will set 



k — 1 

r k = a ri 



6fe = a (fe-2)/2 rfi(1 _^ ) 



k 2 

(k = c+ — 
aN 



(4.40) 



To simplify our expressions we will assume in the sequel that 

exp(-^ (1 -^)<i 
and that a < 1/2. Then by a straightforward computation 



(4.41) 



E 

fe=2 



Pb^k-i) + W3p bit (r fe _i)a | In 



r k -i 



Ck 



< e -m 2 ^ 2 [2 {e"^) 2 ^) 2 + y 3 a(|lna| + l)} + ^) 



(4.42) 
From this, the Lemma follows immediately from the observation that a sum of
r.v.'s exceeds a given sum only if at least one of the r.v.'s exceeds the corresponding summand. ◊

Proposition 4.8: Define 



T(a,a,p) = \2y2V2e d-y^) 2 + a (| ma | + 2) + a^l + r(a) +2a 2 (l + r(a)) 



a[2e "" 2 + 2^/3a(\lna\ + 2) 



(4.43) 



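Then

$$\mathbb{P}\left[\sup_{v\in B_\rho} X_a(v) > \rho^2\,\Gamma(\alpha,a,\rho)\right] \le e^{-\alpha N} + \mathbb{P}\left[\|A - \mathbb{I}\| > r(\alpha)\right] \qquad (4.44)$$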



Proof: The proposition is just a combination of Lemma 4.6 and 4.7 and a somewhat arbitrary
choice of $r_1$ and $d_1$. We set $a_1 = (1 - \lambda)a$, $d_1 = \lambda a$ and $r_1 = \alpha\rho$. Then



IP 



sup 

v£B p 



X a {v)> p 2 \2^2^e ^ +a(|ln a| + c) + a-\/||A|| ) +2a 2 / o 2 ||A| 



+ 2(l-A) 2 a 2 e-*(^) 2 ( 1 -v^) 2 ( 2c -£(t)'(i-VS)^ + 2 y 3 a(| In a| + c) 
< 2e - iVa ( c - 1 ) 



(4.45) 



Finally, we may choose A in such a way that ^(1 — \fo) 2 = 1 and this together with the estimate 
on the norm of A from Section 2 gives the proposition. Q 

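We combine the previous results to get the desired lower bound

$$\Phi(m^*e^1 + v) - \phi(m^*) \ge \tfrac12\left(v, B_-(\rho)v\right) - m^*(v, z^{(1)}) \qquad (4.46)$$

with

$$B_-(\rho) \equiv c_-(\beta,r)\,\mathbb{I} + (\mathbb{I} - A)(1 - c_-(\beta,r)) - c_-(\beta,r)\,\Gamma(\alpha,a,\rho)\,\mathbb{I} \qquad (4.47)$$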



We turn now to the derivation of the corresponding upper bound. The strategy to use will
depend on the value of $\beta$. If $\beta > 1.5$, then $m^*(\beta) > 0.5$ and little is lost if we use instead of (4.6)
the rougher estimate

$$\phi(m^* + z) - \phi(m^*) \le \frac{z^2}{2} \qquad (4.48)$$
This then yields

$$\begin{aligned}
\Phi(m) - \phi(m^*) &\le -\tfrac12(v,Bv) - m^*(v,z^{(1)}) + \tfrac12\left[1-\beta(1-(m^*)^2)\right](1+c(\beta,r))\,\frac1N\sum_{i=1}^N\mathbf{1}_{\{|(\xi_i,v)|\le a\}}(\xi_i,v)^2 \\
&\qquad + \frac{1}{2N}\sum_{i=1}^N\mathbf{1}_{\{|(\xi_i,v)|>a\}}(\xi_i,v)^2 \\
&= \tfrac12\left(v,\left[\mathbb{I} - (1-c_+(\beta,r))A\right]v\right) - m^*(v,z^{(1)}) + \tfrac12\,X_a(v)\left[1 - c_+(\beta,r)\right]
\end{aligned} \qquad (4.49)$$

From the previous estimates on $X_a(v)$ it then follows immediately that

$$\Phi(m^*e^1 + v) - \phi(m^*) \le \tfrac12\left(v, B_+(\rho)v\right) - m^*(v, z^{(1)}) \qquad (4.50)$$

with

$$B_+(\rho) \equiv c_+(\beta,r)\,\mathbb{I} + (\mathbb{I} - A)(1 - c_+(\beta,r)) + \mathbb{I}\left[1 - c_+(\beta,r)\right]\Gamma(\alpha,a,\rho) \qquad (4.51)$$

For $\beta$ close to 1, this estimate is not very good. This can be seen from the fact that in the difference
between $B_-$ and $B_+$ there occurs a term that is not proportional to $(m^*)^2$. To remedy the situation
we must proceed more carefully with the term $\tanh^2\beta(m^* + |z|)$ in (4.6), taking advantage, on the
other hand, of the fact that $\beta$ is now strictly bounded.

Thus we replace (4.49) by 

$(m) - 4>(m*) < -\ (v,Bv) - m* {v,z^ 

+ Jn [1 " 0(1 " (™*) 2 )] (1 + <P,r))jr E hlU^Ka}^) 2 



N 



i=l 

N 



(1 " P) Jn E %fc.*)l>a}(&> v ? + 2 X ™*k{v) 

i=l 

N K-l 
^E^K^.")^-}^^) 2 E S { m *fe<l(£ i ^)l<(fe+l)m*} tanh2 (/ 3 ( m * + 



i=l k=0 



< \ (v, [1 - (1 - c+(/3, t))A]v) - m* [v, 

K-l 

E tanh 2 (/3(m*(A: + l)))X km .(v) + §X m . K (v) 



_ ,. ..,..,2/ 

K-l 

+ £ 
' 2 

fe=l 

(4.52) 

With our previous bounds, we can replace the various $X(v)$ by the bounds from Proposition 4.8
on their suprema over $v$ with given norm. To simplify the resulting expressions, we will use that
for $\alpha < 0.1$,



T(a, km*, p) < 8^2 exp ^-(1 - ijaf ^ + a [4| In a| + 10] 

Moreover, we bound $\tanh^2(\beta m^*(k+1)) \le \beta^2(m^*)^2(k+1)^2$. Thus we can bound



(4.53) 



K-l 



sup ^ tanh 2 (/3(m*(A; + l)))X km ,{v) 

v:\\v\\ 2 <p k=1 



< (5\m*) 2 p 2 J2(k + l) 2 8V2exp (-(1 - 2^) 2 ^^) + ocK [4| lna| + 10] p 2 

( (m*) 2 

< j(m*) 2 p 2 K^/ol[A\ In a | + 10] + 208V2 / o 2 (m*) 2 exp ( -(1 - 2^fa) 2 ^-^- 



(4.54) 



where the numerical constant in the last bound was obtained under the hypothesis that a and p 
are such that exp f-(l - 2^/a) 2 ^f^-) < 1/2. Finally, K must be chosen such that both 



Ky/a[4\]n.a\ + 10] < 1 



and 



S^exp (-(1 - 2^) 2 ^^) < ip\m*) 2 



(4.55) 



(4.56) 



It is easy to check that this is the case if 



K 



2p 



m 8V2 



m*(l - 2^/a) 

Combining everything, we see that we get again the upper bound (4.50), but this time with 

B+(p) = c+(/3,t)1+(1-A)(1 - c+(/3,r)) 

+ (m*)W((l - /3)/(m*) 2 + (5 2 ) V{a, a,p) + 2 7 + 300 exp (-(1 - 2^) 2 ^ 



(4.57) 



(4.58) 



We summarize the results of this subsection in the following theorem. 

Theorem 4.9: There exists a set $\tilde\Omega \subset \Omega$ of measure one such that for all $\omega \in \tilde\Omega$, for all but a finite number of
values $N$, for any $0 < \rho < 1$ and for all $\|v\|_2 \le \rho$,

$$\Phi_{N,\beta}[\omega](m^*e^1 + v) - \phi(m^*) \ge \tfrac12\left(v, B_-(\rho)v\right) - m^*(v, z^{(1)}) \qquad (4.59)$$

and

$$\Phi_{N,\beta}[\omega](m^*e^1 + v) - \phi(m^*) \le \tfrac12\left(v, B_+(\rho)v\right) - m^*(v, z^{(1)}) \qquad (4.60)$$

where $B_-(\rho)$ is defined by (4.47) and $B_+(\rho)$ is given by (4.51) if $\beta > 1.1$ and by (4.58) if $1 < \beta \le 1.1$.

Proof: This theorem follows simply from our previous estimates and using the Borel-Cantelli
lemma. ◊



4.2. Localization of the minima 

Theorem 4.9 contains the main information needed for the analysis of the structure of the 
minima of the function $. As we will explain later, it also serves as a starting point for a more 
refined analysis of that function. 

Theorem 4.10: There exist finite positive constants $c_1, c_2, c_3$ such that the following holds for
almost all $\omega$ for all but a finite number of values $N$: If $\sqrt\alpha \le c_1 (m^*(\beta))^2$ then for all $v$ such that

$$c_2\,\frac{\sqrt\alpha\,m^*(\beta)}{c_-(\beta,r)} \le \|v\|_2 \le c_3\,m^*(\beta) \qquad (4.61)$$

and for all $(\mu, s)$,

$$\Phi_{N,\beta}[\omega](s m^* e^\mu + v) > \phi(s m^* e^\mu) \qquad (4.62)$$



Remark: Theorem 4.10 establishes the existence of a local minimum at a distance of order $\sqrt\alpha\,m^*/c_-(\beta,r)$
from the points $s m^* e^\mu$. We will soon localize them more precisely. This is a generalization of the
results of Newman [N] and Komlós and Paturi [KPa] to finite temperatures. If we consider the
asymptotic regime where $\beta \sim \beta_c = 1$ we have that $m^*(\beta) \sim \sqrt{\beta - 1}$ and $c_-(\beta,r) \sim \beta^2 - 1$. The
condition on $\alpha$ is then of the form $\alpha \le c(\beta - 1)^2$ and for sufficiently small $c_1$ the upper bound is
seen to be a multiple of the lower one. Notice that this behaviour of the critical $\alpha$ as a function of
$\beta$ near one is (up to the constants) the same as the one found by [AGS] using the replica method.
For large $\beta$, we have checked numerically that the constant $c_1$ can be chosen at least as 0.04.

Proof: The proof of Theorem 4.10 relies on the lower bound (4.59) from Theorem 4.9 and the
following estimate on the norm of the vectors $z^{(\nu)}$.



Lemma 4.11: Let $z^{(\nu)}_\mu = \frac1N\sum_i \xi_i^\mu\xi_i^\nu$ for $\mu \ne \nu$ and $z^{(\nu)}_\nu = 0$. Then, for all $\epsilon$ sufficiently small



IP 



> (l + e)v^ 



< e-^ M ' s 



(4.63) 



Proof: Note that for fixed $\nu$, the $z^{(\nu)}_\mu$ are independent for different $\mu$. Moreover, for $\mu \ne \nu$ and under the
assumption of the Lemma, they are stochastically dominated by independent normally distributed
random variables $\tilde z_\mu$. The bound (4.63) then follows by a simple application of the exponential
Markov inequality. ◊
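The scale $\|z^{(\nu)}\|_2 \approx \sqrt\alpha$ in Lemma 4.11 is easy to see in a simulation: each $z^{(\nu)}_\mu$ with $\mu\neq\nu$ has variance $1/N$, so $\mathbb{E}\|z^{(\nu)}\|_2^2 = (M-1)/N \approx \alpha$. The sketch below is purely illustrative; the sizes $N$, $M$ and the seeds are our choices.

```python
import random

def z_norm_sq(N, M, nu=0, seed=0):
    # z_mu = (1/N) sum_i xi_i^mu xi_i^nu for mu != nu, and z_nu = 0
    rng = random.Random(seed)
    xi = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(N)]
    total = 0.0
    for mu in range(M):
        if mu == nu:
            continue
        z_mu = sum(row[mu] * row[nu] for row in xi) / N
        total += z_mu * z_mu
    return total

N, M = 2000, 100            # alpha = M/N = 0.05
alpha = M / N
samples = [z_norm_sq(N, M, seed=s) for s in range(5)]
print(alpha, [round(s, 4) for s in samples])
```

Every sample of $\|z^{(\nu)}\|_2^2$ should land close to $\alpha$, in line with a bound of the form $\|z^{(\nu)}\|_2 \le (1+\epsilon)\sqrt\alpha$ holding with overwhelming probability.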
To prove the theorem, we may now choose $a$ in Proposition 4.8 in a suitable way. A
possible choice is a = m* /3. With this choice c_(/3,r) > f (l - (3(1 - (m*) 2 )). For ||i;||2 satisfying 
the upper bound in (4.61), we can make the terms e K " 2 appearing in T(a, a, p) as small as desired 
by choosing C3 small, while the terms ailnp/r + 2) can also be made small by choosing c\ small, 
and also all the terms of order ^/a are small compared to c_(/3,r) under the assumption on a. 
Thus we get effectively a bound

$$\Phi(s m^* e^\mu + v) - \phi(m^*) \ge \tfrac12\,c_-(\beta,r)\|v\|_2^2 - m^*\sqrt\alpha\,(1+\epsilon)\|v\|_2 \qquad (4.64)$$

and this is strictly positive if $v$ satisfies the lower bound in (4.61). Moreover, the lower bound in
(4.61) is smaller than the upper one if $c_1$ is sufficiently small, so that our statement is not void.

◊◊

Theorem 4.10 will now be sharpened in the sense that we can locate more precisely the position of the
true local minima.

Lemma 4.12: For all $\rho$ sufficiently small such that $B_-(\rho)$ is strictly positive we define $v^{(\nu)} \equiv
m^* B_+(\rho)^{-1} z^{(\nu)}$. Then for all $\rho$, and for all $v$ such that

\\v-v^\\ 2 > m^U+e)!!^ 1 !! (y/\\B-{p)-i\\ \\B+(p) - B_(p)\\ + 2\\B_(p)- 1 \\ \\B+(p) - B_(p)\\) 

(4.65) 

and $\|v\|_2 \le \rho$ one has that, for almost all $\omega$, for all but a finite number of indices $N$,

$$\Phi_{N,\beta}[\omega](e^\nu m^* + v) > \Phi_{N,\beta}[\omega](e^\nu m^* + v^{(\nu)}) \qquad (4.66)$$



Proof: This lemma follows from Theorem 4.9 by some elementary algebra and Lemma 4.11. ◊

It remains to estimate the various norms appearing in (4.65). This is an elementary, but
somewhat painful, exercise and we will just consider the two asymptotic regimes $\beta \downarrow 1$ and $\beta \uparrow \infty$.
We collect these bounds, which are easily obtained from our previous estimates, without going into
the details of the proofs. We also, for the sake of clarity, take the liberty to throw away all insignificantly
small corrections.

Lemma 4.13: Let us put $\sqrt\alpha = \gamma\,(m^*)^2$. Then we have for $1 > r > 0$ to be chosen later:

(i) To leading order in the limit $\beta \downarrow 1$

q 1 1 
" + " " " " " " 2(m*) 2 1 - r(r + l)/2- 37 -(l-r(r+l)/2)r(rm*, a, /)) l ' ' 



and 

115. - B. 



< (m*) 2 (^3^ + 4T(a,rm*,p) + 2 7 + 300 exp (-^)) (4-68) 
(ii) If $\beta \uparrow \infty$,



and 



lif-i II < || 5 -i|| < ± 

1 + 11 " 11 " 11 " l-T(T,a,p) 



\\B + -B_\\ < T(r,a,p) 
and to leading order in a, for p = ^ ] /a l 



(4.69) 
(4.70) 



r(r,a, / o)«8^exp^-^-(l-2^) 2 ) +a[|lna| + 2] (4.71) 



Note that by Theorem 4.10 we only need to consider the ball of radius p = c 2^p7jg~7)- In case 
(ii) we can choose r arbitrary close to one to get the result 



\B + - J B_|||| J BZ 1 || ~ 8^2exp (-^"C 1 ~ 2 V")^ + 2a[|lna| 



2] 



(4.72) 



In case (i) we still have to make our choice for the parameter r. Note that in that case c_(/3,r) ~ 
|(m*) 2 (l — t(t+ l)/2) and, anticipating that we will chose r small, we have p ~ ^-jm*. Inserting 
this value in the bounds (4.67) and (4.68), we get 



15+ - B\\\\BZ 1 \ 



t(t + 1) + 32^2exp 



9r 2 



16c 2 7 2 



27 + 300 exp 



16c 2 7 2 



(4.73) 



With the natural choice r 2 = —k 2 -^ 2 \ ln7| this gives 



\B+ - J B_|||| J BZ 1 | 



c 2 7V|ln 7 | + 2 7 + 0( 7 2 ) 



(4.74) 



If we notice further that the dominant part of the matrix B + is a multiple of the identity, we arrive 
at the following 

Theorem 4.14: For any {3 > 1 set 6(/3) = 1 _ j g™_f(„[»)2) • There exists 70 > such that for all 
7 < 7o ; for almost all oj, for all but a finite number of indices N , the following holds: for all v such 
that \\v\\2 < c^m* and 

v-b((3)z M > c 4 m*7 3/ V|ln7| (4.75) 
2 

$j V , /3 H(m*e M + v) > inf § N tP [uj][m* + v) (4.76) 

|M| 2 <c 3 m* 

for some strictly positive constant $c_4$ and where $c_3$ is the same constant as in Theorem 4.10.

This theorem allows us to locate quite precisely the (random) position of the lowest minimum
of the function $\Phi$ in the vicinity of any of the points $m^*e^\mu$. It is of interest to observe that in smaller
regions these minima are even unique, i.e. there are no other local minima in the immediate vicinity
of the 'Mattis states'. This is the main content of the last theorem of this section.



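Theorem 4.15: Assume that $1 < \beta < \infty$. Then there exist $\alpha_0(\beta)$ and $\rho(\beta)$ such that if
$\alpha < \alpha_0(\beta)$, with probability one for all but a finite number of indices $N$, $\Phi_{N,\beta}[\omega](m^*e^1 + v)$ is a
twice differentiable and convex function of $v$ for all $v$ with $\|v\|_2 \le \rho(\beta)$.

Proof: The differentiability for fixed $N$ is no problem. The non-trivial assertion of the theorem is
the local convexity. We have that

$$\begin{aligned}
D^2\Phi(m^*e^1 + v) &= \mathbb{I} - A + \frac1N\sum_{i=1}^N \phi''\big(m^*\xi_i^1 + (\xi_i,v)\big)\,\xi_i^T\xi_i \\
&= \mathbb{I} - A + \frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|\le\tau m^*\}}\,\phi''\big(m^*\xi_i^1 + (\xi_i,v)\big)\,\xi_i^T\xi_i \\
&\qquad + \frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|>\tau m^*\}}\,\phi''\big(m^*\xi_i^1 + (\xi_i,v)\big)\,\xi_i^T\xi_i
\end{aligned} \qquad (4.77)$$

The point here is that

$$\phi''(z) = 1 - \beta(1 - \tanh^2(\beta z)) \qquad (4.78)$$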



so that (j)"(x) > c if \x\ > ^ tanh 1 (^y 1 — ^q^J • Moreover, for arbitrary x we have that (j)"(x) > 
1— P- Thus if we set r = tanh -1 I y^~j^ ) ~ 1) denoting by \ m in (D 2 </>(m* e 1 + i;)) the smallest 



eigenvalue of D <j>[m*e + v), we get that 



A min (DV(mV + v)) > 1 - (1 - c) p|| - (/3 - 1 - c) 



AT 



(4.79) 



What we need to do
is to estimate the norm of the last term in (4.79). Now,



sup 

v£B p 



N 



i=l 



N 



SUp SUp j2jfJ2'S-{M i ,v)\>TTn*}(.U,w) 2 



i=l 



N 



(4.80) 



< X sup sup ^E%^)l>™*}(&>™) 2 
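We use the trick of writing

$$\begin{aligned}
\mathbf{1}_{\{|(\xi_i,v)|>\tau m^*\}}(\xi_i,w)^2 &= \mathbf{1}_{\{|(\xi_i,v)|>\tau m^*\}}(\xi_i,w)^2\left(\mathbf{1}_{\{|(\xi_i,w)|\le|(\xi_i,v)|\}} + \mathbf{1}_{\{|(\xi_i,w)|>|(\xi_i,v)|\}}\right) \\
&\le \mathbf{1}_{\{|(\xi_i,v)|>\tau m^*\}}(\xi_i,v)^2 + \mathbf{1}_{\{|(\xi_i,w)|>\tau m^*\}}(\xi_i,w)^2
\end{aligned} \qquad (4.81)$$

so that

$$\frac1N\sum_{i=1}^N \mathbf{1}_{\{|(\xi_i,v)|>\tau m^*\}}(\xi_i,w)^2 \le X_{\tau m^*}(v) + X_{\tau m^*}(w) \qquad (4.82)$$

by which token we are reduced to estimating the same quantities as before. We obtain therefore on
$\Omega_1$ for all $v$ with norm less than $\rho$,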

A min (D V(mV + v)) > 1 - (1 - c)(l + r(a)) - (0 - 1 - c)r(a, rm*, p) (4.83) 

which proves the theorem and allows one to estimate the constants involved. ◊

Remark: Note that the estimates derived from (4.83) become quite bad if $\beta$ is large. This is due
to the fact that the second derivative of $\phi$ satisfies a poor uniform bound in this case. However,
this bad bound is realized only in a small region, so that a more careful analysis should allow one to
replace $(1 - \beta)$ by a bounded constant.



4.3. The macroscopic component of the minima near the 'Mattis states' 

We have seen so far that the location of the minima of $ is shifted away from the 'Mattis states' 
m*e M by a random vector up to error terms of small norm. The components of are all 
"microscopic" i.e. of order [AGS] found, on the basis of the replica method that the location 

of the minimum associated to the pattern p undergoes a macroscopic shift of order exp (—j^) of its 
$\mu$-th component. We will show that from Theorem 4.14 such a result can be derived in a rigorous
form. Without restriction of generality, we consider a minimum with $(\mu = 1, s = +1)$. We denote
the 1-component of the location of a minimum according to Theorem 4.14 by $m^1(N)$ and set

$$m^1_+ \equiv \limsup_{N\uparrow\infty} m^1(N) \qquad (4.84)$$

and

$$m^1_- \equiv \liminf_{N\uparrow\infty} m^1(N) \qquad (4.85)$$



Theorem 4.16: Assume that a satisfies the hypothesis of Theorem 4-14- Then there exists a 
finite constant c$ such that, IP -almost certainly, 

1 f e 2 i 
m\ < —= \ e _£ 5~ tanh/3(m^ + ^ym* + ^/am*x)dx + c 5 ^/a\ lna\e ^ 2 M"-rl (4.86) 

V27T J 

1 f E 2 1 
n\_ > —= \ e~^ tanh/3(m 1 : + s/jm* + ^/am*x)dx - c 5 ^/a\ lna\e ^l 1 "^ (4.87) 

V27T J 



and 
A special case of this Theorem is the following
Corollary 4.17: In the limit $\beta \uparrow \infty$, $\mathbb{P}$-almost surely

m\ < Erf(^j^j + c 5 Va|lna|e"^M^T (4.88) 

and 

m\ > Erfl ^-^f 1 ^ - c s </a\ In a\e~ *- 2 M--l (4.89) 
where Erf[x) = J* dte~ l is the error function. 

Remark: The bounds (4.89) can be evaluated numerically, but it is clear that (4.88) implies that
$m^1_+ \le 1 - O(e^{-1/\alpha})$ and that for $\alpha$ small enough there exists $m^1_-$ of the same order which verifies
(4.89). A numerical analysis of these inequalities shows that solutions near 1 exist up to values of
$\alpha$ of order 0.1, much larger than those for which the hypothesis of Theorem 4.16 can be proven.
Corollary 4.17 should be compared to the heuristically derived set of equations (4.5-7) of [AGS],
namely $m^1 = \mathrm{Erf}(m^1/\sqrt{2\alpha r})$, where $r = (1 - C)^{-2}$ and $C = \sqrt{2/(\pi\alpha r)}\,\exp(-(m^1)^2/(2\alpha r))$. They use
these to determine the critical storage capacity by finding the maximal value $\alpha$ for which a non-zero
solution exists. The inequalities of Theorem 4.16 compare with the equations (5.5,6) of [AGS].
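The heuristic [AGS] equations quoted in this remark can be solved numerically to find the largest $\alpha$ admitting a non-zero solution. The sketch below assumes that Erf denotes the standard normalized error function (Python's `math.erf`) and uses the substitution $y = m^1/\sqrt{2\alpha r}$, which reduces the three equations to $\mathrm{erf}(y) = y\,(\sqrt{2\alpha} + \tfrac{2}{\sqrt\pi}e^{-y^2})$; the function names and the search grid are ours.

```python
import math

def alpha_of_y(y):
    # With y = m/sqrt(2*alpha*r) the [AGS] equations reduce to
    # erf(y) = y*(sqrt(2*alpha) + (2/sqrt(pi))*exp(-y*y)); solve this for alpha.
    s = math.erf(y) / y - 2.0 / math.sqrt(math.pi) * math.exp(-y * y)
    return 0.5 * s * s if s > 0.0 else 0.0

def critical_alpha(y_min=0.5, y_max=5.0, steps=20000):
    # largest alpha for which some y > 0 solves the reduced equation
    best_alpha, best_y = 0.0, y_min
    for k in range(steps + 1):
        y = y_min + (y_max - y_min) * k / steps
        a = alpha_of_y(y)
        if a > best_alpha:
            best_alpha, best_y = a, y
    return best_alpha, best_y

alpha_c, y_c = critical_alpha()
m_c = math.erf(y_c)
print(f"alpha_c ~ {alpha_c:.4f}, overlap at alpha_c ~ {m_c:.3f}")
```

Under these assumptions the maximal $\alpha$ comes out near 0.138, the value usually quoted for the replica-symmetric critical capacity; this is a statement about the heuristic equations only, not about the rigorous bounds of Theorem 4.16.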

Proof: Let m be any minimum of $ in the ball B Cilh ^^ (m*(/3)z^ 1 ^) . Then, it must be a solution 
of the system of equations

$$m^\mu = \frac1N\sum_{i=1}^N \xi_i^\mu\tanh\left[\beta(\xi_i, m)\right], \qquad \mu = 1,\dots,M. \qquad (4.90)$$
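The system (4.90) can be explored by naive fixed-point iteration started at a stored pattern. The sketch below is an illustration under our own assumptions (random $\pm1$ patterns, $N = 1000$, $M = 10$, $\beta = 2$, forty sweeps); nothing in the text prescribes this iteration, and convergence is only expected when $\alpha$ is small.

```python
import math
import random

def m_star(beta):
    # positive root of m = tanh(beta*m), by bisection on (0, 1]
    lo, hi = 1e-9, 1.0
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if math.tanh(beta * mid) > mid:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def iterate_4_90(xi, beta, sweeps=40):
    # repeatedly apply m -> (1/N) sum_i xi_i tanh(beta (xi_i, m))
    N, M = len(xi), len(xi[0])
    m = [1.0] + [0.0] * (M - 1)          # start at the first pattern
    for _ in range(sweeps):
        new = [0.0] * M
        for row in xi:
            t = math.tanh(beta * sum(r * c for r, c in zip(row, m)))
            for mu in range(M):
                new[mu] += row[mu] * t
        m = [x / N for x in new]
    return m

rng = random.Random(1)
N, M, beta = 1000, 10, 2.0               # alpha = 0.01
xi = [[rng.choice((-1, 1)) for _ in range(M)] for _ in range(N)]
m = iterate_4_90(xi, beta)
ms = m_star(beta)
rest = math.sqrt(sum(x * x for x in m[1:]))
print(f"m*={ms:.4f}  m_1={m[0]:.4f}  |rest|={rest:.4f}")
```

The first component should stay close to $m^*(\beta)$, while the remaining components form a small random vector of norm of order $\sqrt\alpha$, as the discussion of the shift by $z^{(1)}$ suggests.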

Now we can write m = e 1 m 1 -\-m*z^ +u> where w\ = and ||u»||2 < C4^6(/3)-^/a. Then the equation 
for the component $m^1$ reads

AT 

ml = jf Z) tanh [0( ml + m *(£> z(1) ) + (L w ))] ( 4 - 91 ) 

i=l 

where £j = For any a > we can write 

AT 

m 1 ^^tanh^m 1 + m *(£, z«) + a )] 

i=l 

+ ^ %^)l>a} [ tanh [0( ml + m *(&> + (&> ™))] " tanh[/3(m 1 + m*(&, z^) + a] 

i=l 
N 

<jjJ2 tanh[/3(m 1 + m*(&, z (1) ) + a)] + y a (™) 

(4.92) 
where $Y_a(w)$ is defined in (4.14). Similarly

$$m^1 \ge \frac1N\sum_{i=1}^N \tanh\left[\beta\left(m^1 + m^*(\tilde\xi_i, z^{(1)}) - a\right)\right] - Y_a(w) \qquad (4.93)$$

We should expect that the averages over $i$ in the formulas (4.92) and (4.93) converge to expectations
with respect to some measure. This is indeed the case due to the following lemma.

Lemma 4.18: Let $\delta_x$ denote the Dirac measure concentrated on $x$ and let $\mathcal{N}_{0,\alpha}$ be the
Gaussian measure with mean zero and variance $\alpha$. Then,

$$\text{w-}\lim_{N\uparrow\infty}\frac1N\sum_{i=1}^N \delta_{(\tilde\xi_i, z^{(1)})} = \mathcal{N}_{0,\alpha} \qquad \mathbb{P}\text{-a.s.} \qquad (4.94)$$



We will give the proof of this Lemma in the appendix.

We recall further that the quantity sup^^ Y a (w) is known from Lemma 4.7 and, by a simple 
application of the Borel-Cantelli Lemma we obtain that, almost certainly, 

lim sup Y a (w) < f(a, a,m*,j) (4.95) 

where 

2 1 



rrij 

and 

m 



(4.96) 



^■■-•■i)= n '|-i(^fc) 

Putting these observations together, we find that, almost surely, 

1 f 2 ~ 

\ < / ctae'fe tanh[/3(mi +a + m*x)] + T(a, a,m*,j) (4.97) 

V27ra J 



If 2 ~ 

\_ > / rfcce"^ tanh[/3(m 1 : - a + m*x)] - T(a,a,m*,j) (4.98) 

V27ra J 

Choosing $a = \sqrt\gamma\,m^*$ we obtain from here the claims of the theorem. ◊
5. Applications to the Gibbs measures: Proof of Theorem 3

Theorem 3 follows from the estimates in the last two sections in a fairly straightforward way
along the lines of [BGP1] and [BGP2]. We only give a rough outline in order to avoid repetitions.
In particular, we will only show how the results are obtained for the measures $\tilde Q$ and leave the
remaining step that can be copied from [BGP1] to the reader. To simplify our notation, let us set
$B_\rho^{(\mu)} \equiv B_\rho(m^*e^\mu)$ and $R_\rho \equiv \{\cup_{(\mu,s)} B_\rho(s e^\mu m^*)\}^c$. Let us also introduce the integrals

$$I_\rho^{(\mu)} \equiv \int_{B_\rho^{(\mu)}} d^M z\; e^{-\beta N(\Phi(z) - \phi(m^*))} \qquad (5.1)$$

and

$$J_\rho \equiv \int_{R_\rho} d^M z\; e^{-\beta N(\Phi(z) - \phi(m^*))} \qquad (5.2)$$

Note that by symmetry, changing $B_\rho^{(\mu)}$ to $B_\rho(-m^*e^\mu)$ in (5.1) does not change $I_\rho^{(\mu)}$. To simplify
our presentation, we will denote by $\Omega_2$ the subset of $\Omega$ on which our various bounds on $\Phi(m)$ from
Sections 3 and 4 hold. All bounds stated in this section are true on $\Omega_2$; recall that the probability
of $\Omega_2$ is exponentially close to one.

By Theorem 4.9 we have that

$$I_\rho^{(\mu)} \ge \int_{\|v\|_2\le\rho} d^M v\; e^{-\beta N\left[\frac12(v, B_+(\rho)v) - m^*(v, z^{(\mu)})\right]} \qquad (5.3)$$



Using that for p > Am*\\z^ || 2 \\B^ (p)\\ 



d M ve 



/3AM ^{v ,B + {p)v)-m* {v ,z 



< 



V\\2>P 



/3AT 



M/2 



»\\B+\p)\\ 



we get 



^\\B-\p)\\ 
/3AT 



M/2 



/3AT 



l-2 M e 8 H B + 1 ( p )ll 



(5.4) 



(5.5) 



Using in addition to Theorem 4.9 the lower bounds on $ from Section 4 we get on the other hand 
J P < [ d M ze^{-\f3Nc{m*)\\\z\\ 2 -m*f) 
d M zexp (-(3Nc 2 (m*) 4 ) 

(5.6) 



f t 4 m* ,1/3B 



<J \\v\\ 



d M ve 



/3AM ±(v,B-v)-m*(v,z 



< (m*) M y M exp {-Nf3c(m*) A ) + 2M 



Sir\\BZ 1 (p)\\ 
~P~N 



M/2 



8||B_ (f) 
where $V_M = 2\pi^{M/2}/\Gamma(M/2)$ denotes the surface area of the $M$-dimensional unit sphere and $c > 0$ is some
constant. Now choose $\rho = c_5\sqrt{\alpha}/m^* = c_5 \gamma m^*$. Set further $\|B_\pm^{-1}(\rho)\| = c_\pm/(m^*)^2$. Then the
above estimates combine to



J 



M/2 _ PN 1 2 cl(m*) 



8c_ 



(5.7) 



< e 



-(3Mc 7 



where $c_7 > 0$ is some constant depending on $c_5$, $\gamma_a$ and $c_+/c_-$. Since clearly



(5.8) 



$\tilde{Q}(R_{c_5 \gamma m^*})$ obeys the same bound.

Next, to prove the second statement of Theorem 3, observe that 



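$$\ln \frac{I_\rho^{(\mu)}}{I_\rho^{(\nu)}} = \left(\ln I_\rho^{(\mu)} - \mathbb{E} \ln I_\rho^{(\mu)}\right) - \left(\ln I_\rho^{(\nu)} - \mathbb{E} \ln I_\rho^{(\nu)}\right) \tag{5.9}$$

Noticing that the function $\Phi_{N,\beta}[\omega](z)$ satisfies the Lipschitz bound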



(5.10) 



we can again use Theorem 2.5, this time without using its full power, since the Lipschitz
constant is bounded uniformly. This implies that for all $x > 0$



IP 



^ (in J(") - IE In A*) >x 



-N- 



< 4e 32(m* ) 2 



(5.11) 



To complete the proof of Theorem 3 we show that, with regard to the objects we consider, the
measures $Q$ and $\tilde{Q}$ differ only by exponentially small terms. More precisely



Lemma 5.1: Assume that $\alpha < \gamma^2 (m^*)^4$. Then on the set $\Omega_2$



< e 



-c s (3M 



(5.12) 



Proof: From the fact that $\tilde{Q}$ is the convolution of $Q$ with the Gaussian measure of mean zero and
variance $(\beta N)^{-1}$ it follows that



Q(BM) < S(<L m .) + 2 M e-\^^'? 



and 



Q(BM) > Q{B^} S ,) - 2 M e-^ N ^ s ^™? 



(5.13) 
(5.14) 
On the other hand,



4c_ 



M/2 t3N-t 2 { m *) l {c s -S) 2 

e 8 *+ 



(5.15) 



by the same type of computation than the one leading to (5.7). Choosing 8 = cgm* with eg > 1, 
we obviously get (5.12) with eg depending only on C5 and c + /c_. Q 
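
The Gaussian smoothing used in this proof can be checked numerically in one dimension. The sketch below assumes the Gaussian has variance $1/(\beta N)$, which is what the Hubbard-Stratonovich transformation mentioned in the abstract produces, and verifies the underlying identity $e^{\beta N m^2/2} = \sqrt{\beta N/2\pi} \int dz\, e^{-\beta N z^2/2 + \beta N m z}$ by quadrature. The function names, grid size and cutoff are illustrative choices only.

```python
import math


def hubbard_stratonovich_rhs(beta_n, m, n_grid=20001, cutoff=12.0):
    """Trapezoidal estimate of
    sqrt(beta_n / (2 pi)) * int dz exp(-beta_n z^2 / 2 + beta_n m z).

    The integrand is a Gaussian bump centred at z = m with standard
    deviation 1 / sqrt(beta_n), so the grid is centred there.
    """
    sd = 1.0 / math.sqrt(beta_n)
    lo = m - cutoff * sd
    h = 2.0 * cutoff * sd / (n_grid - 1)
    total = 0.0
    for k in range(n_grid):
        z = lo + k * h
        weight = 0.5 if k in (0, n_grid - 1) else 1.0
        total += weight * math.exp(-0.5 * beta_n * z * z + beta_n * m * z)
    return total * h * math.sqrt(beta_n / (2.0 * math.pi))


def hubbard_stratonovich_lhs(beta_n, m):
    """Closed form exp(beta_n * m^2 / 2) that the integral should reproduce."""
    return math.exp(0.5 * beta_n * m * m)
```

Here `beta_n` stands for the product $\beta N$. Completing the square in the exponent shows why the two sides agree, and why the smoothing kernel has width $(\beta N)^{-1/2}$.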

From Lemma 5.1 and (5.7) and (5.8) follows the first assertion of Theorem 3. The second 
follows from (5.9) and (5.11), provided 



-PMc s 



sup 



as $N \uparrow \infty$. But clearly



» s(4 M) ) 



I a.s. 



2MmUI<f ) /I<f ) + J p /I<f ) 



(5.16) 



(5.17) 



The second term in the denominator is exponentially small by (5.7) while by (5.11) 



IP 



2Minf i<"V4' l) ^ 2MeP Mc */ 4 



-,SM(m* ) 2 c\ 

< 4Me 512 



(5.18) 



From here we get (5.16) and this concludes the proof of Theorem 3. 
Appendix: Proof of Lemma 4.18



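We introduce the abbreviation $X \equiv X(N) \equiv \frac{1}{\sqrt{N}} \sum_{j=1}^N \xi_j$. Lemma 4.18 can then be written
in the following form.

Lemma A.1: Let $\delta_x$ denote the Dirac measure concentrated on $x$ and let $\mathcal{N}_{0,\alpha}$ be the centered
Gaussian measure with variance $\alpha$. Then,

$$\text{w-}\lim_{N\uparrow\infty} \frac{1}{N} \sum_{i=1}^N \delta_{(\xi_i, X)/\sqrt{N} - M/N} = \mathcal{N}_{0,\alpha}, \quad \mathbb{P}\text{-a.s.} \tag{6.1}$$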



Proof: To prove weak convergence, it is enough to prove the a.s. convergence on a measure 
determining class. The main step in the proof is thus the following lemma. 

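Lemma A.2: Let $f \in C^2(\mathbb{R})$ be an increasing bounded function with bounded first and second
derivatives. Then

$$\lim_{N\uparrow\infty} \frac{1}{N} \sum_{i=1}^N \left[ f\left(\frac{(\xi_i, X)}{\sqrt{N}}\right) - \mathbb{E} f\left(\frac{(\xi_i, X)}{\sqrt{N}}\right) \right] = 0, \quad \mathbb{P}\text{-a.s.} \tag{6.2}$$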



Proof: Use the exponential Chebyshev inequality to get that

AT 

f (i 

Vn 



IP 



LEf I > 6 



i=l 



-ATte 



N 



i=l 



X = x 



IP[X = x] 



(6.3) 



e-»» E ^ exp (t £ / (^R) - LEf ( ^ ) ) IP [X = x] 

X \ »=1 / 

where the last equality defines $\mathbb{E}_x$. The crucial point is now that the variables $\xi_j$ under the law
$\mathbb{E}_x$ are negatively associated (see e.g. [JP,Lo]) and therefore

~ ( N N 



AT 



<E JP ^ = -1II S .=P(VW)- JB /W 

x i=l 

< e ^ ^ = *] n + '[^-/ f^) - m f 



Vn 



IEf[^j^-)\-e 



i=l 



(6.4) 
where we have assumed, without loss of generality, that < 1. By the same hypothesis, the

term proportional to t 2 in the last line is bounded by a constant, and we would immediately be 
done if the term proportional to t was zero. While this is not exactly true, we will see that this is 
virtually true on a set of values x which carry all but an exponentially small mass. Let us define 

It is not difficult to show, using for instance the Yurinskii-martingale technique [Yu], that F satisfies 
a concentration estimate. 

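Lemma A.3: Let $F$ be defined by (6.5) and let $X = \frac{1}{\sqrt{N}} \sum_{i=1}^N \xi_i$. Then there exists a constant
$0 < c_f < \infty$ such that for all $\delta < \alpha/2$,

$$\mathbb{P}\left[|F(X) - \mathbb{E} F(X)| > \delta\right] \le c_f \exp\left(-\frac{N\delta^2}{2c_f}\right) \tag{6.6}$$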



Proof: Lemma A.3 is a concentration estimate for $F$ regarded as a function of the $M$ independent
random variables $X_\mu$. To get it, we will show that the derivative of $F$ with respect to $x_\mu$ satisfies
appropriate bounds. We will use that the variables $\xi_i^\mu$ under $\mathbb{E}_x$ are independent for different $\mu$
and that

$$\mathbb{P}_x[\xi_i^\mu = \pm 1] = \frac{1}{2}\left(1 \pm \frac{x_\mu}{\sqrt{N}}\right) \tag{6.7}$$



Therefore, for any v we can write 



F[x) = \m x 



Vn 



(6.8) 



This representation immediately allows us to compute the derivative with respect to $x_\nu$, and since $f$,
$f'$ and $f''$ are assumed to be bounded, a simple computation shows that

$$\left|\frac{\partial}{\partial x_\nu} F(x)\right| \le \frac{C |x_\nu|}{N}$$

This bound allows us to estimate the conditional expectations

a 



IE 



\X V 



2 t|X„| 

e 



dX, 



-F(X) 



\cr(X 1 , . . .,X„_i) 



(6.9) 



< ^m\XXe tMlN (6.10) 



where $\sigma(X_1, \ldots, X_{\nu-1})$ denotes the sigma algebra generated by the variables $X_1, \ldots, X_{\nu-1}$. Noting
that the $X_\mu$ are close to normal (and recalling e.g. Lemma 2.1), we see that the last expectation is
bounded by const.$/N^2$ as long as $2t/N < 1$. These bounds allow the use of the Yurinskii martingale method
(see e.g. [LT]; the specific computations used here are similar to those in Chap. 3 of [BGP2]) to
prove (6.6). We leave the details to the reader. □
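
The conditional law (6.7) used in this proof can be verified exactly by counting. For a single pattern index $\mu$, conditioning on $X_\mu = x_\mu$ fixes the sum $s = \sqrt{N} x_\mu$ of the $N$ signs, and the probability that a given sign equals $+1$ is the fraction of $+1$ entries. The following Python sketch, which assumes i.i.d. symmetric $\pm 1$ variables and uses hypothetical function names, compares the counting argument with the closed form.

```python
from math import comb, sqrt


def conditional_plus_probability(n, s):
    """P[xi_1 = +1 | xi_1 + ... + xi_n = s] for i.i.d. symmetric signs.

    Counts sign configurations: k = (n + s) / 2 entries equal +1, and
    xi_1 = +1 in comb(n - 1, k - 1) of the comb(n, k) configurations.
    """
    if abs(s) > n or (n + s) % 2 != 0:
        raise ValueError("s must have the parity of n and satisfy |s| <= n")
    k = (n + s) // 2
    if k == 0:
        return 0.0
    return comb(n - 1, k - 1) / comb(n, k)


def closed_form_plus_probability(n, x_mu):
    """Closed form (1/2) * (1 + x_mu / sqrt(n)), with x_mu = s / sqrt(n)."""
    return 0.5 * (1.0 + x_mu / sqrt(n))
```

For instance, with $N = 20$ and $s = 4$ both functions give $0.6$, since $12$ of the $20$ signs must be $+1$.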



As an immediate corollary of Lemma A.3 we get that, except on a set of probability smaller
than $c_f \exp(-N\delta^2/2c_f)$,

(6.11) 



( ^ 



mf[ LijB ] \ <8 



Therefore 

£ m. exp (, £ /(^)- mt (Uffi ) I w [x = ,] 

X \ 1=1 / 

< cf exp(-N6 2 /2c f )e tN + exp (+3Nt6 + Nt 2 c ) 
for some finite constant cq. Since for t sufficiently small we can choose 8 2 = 2tcf, we may in fact 



(6.12) 



use 



£ IE X exp ( i ^ / (^p) - iE/ \lP[X = x]<c f + exp (iV[3i 3 / 2 ^ + * 2 c /2] 

X \ 1=1 / 



Inserting this bound in (6.3) and making, for $\varepsilon$ sufficiently small, the choice



81c, 



we get that for some finite constant C 



IP 



> e 



»=i 



< exp ( -N 



2e" 



Cf + exp ( JV 



■2 4e 3 
3 81cj 



CNe 4 



(6.13) 
(6.14) 

(6.15) 



which for $\varepsilon$ sufficiently small is of order $\exp(-Nc\varepsilon^3)$. In much the same way we can also prove that



IP 



»=i 



Vat 



< e 



< exp(-JVce 3 ) 



(6.16) 



From this the lemma follows by the Borel-Cantelli Lemma. □

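To conclude we have to identify $\lim_{N\uparrow\infty} \mathbb{E} f\left(\frac{(\xi_i, X)}{\sqrt{N}}\right)$. Clearly these quantities are the same for
all $i$ and, since

$$\frac{(\xi_1, X)}{\sqrt{N}} = \frac{1}{N} \sum_{\mu} \sum_{j \ge 2} \xi_1^\mu \xi_j^\mu + \frac{M}{N} \tag{6.17}$$

the central limit theorem applied to the independent random variables $\{\xi_1^\mu \xi_j^\mu\}_{j \ge 2,\, \mu \ge 1}$ shows that

$$\lim_{N\uparrow\infty} \mathbb{E} f\left(\frac{(\xi_1, X)}{\sqrt{N}} - \frac{M}{N}\right) = \frac{1}{\sqrt{2\pi\alpha}} \int dz\, f(z)\, e^{-\frac{z^2}{2\alpha}} \tag{6.18}$$

This together with Lemma A.2 implies Lemma A.1.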
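
Lemma A.1 can also be probed by simulation. The sketch below assumes i.i.d. symmetric $\pm 1$ patterns $\xi_i^\mu$ (the usual Hopfield setting) and draws one realization. It computes the values $(\xi_i, X)/\sqrt{N} - M/N$ for $i = 1, \ldots, N$ and returns their empirical mean and variance, which should be close to $0$ and $\alpha = M/N$. The function names, system sizes and seed are illustrative, and a single finite sample only suggests, but of course does not prove, the almost sure convergence.

```python
import random


def empirical_overlap_sample(n, m, seed=0):
    """Return the n values (xi_i, X) / sqrt(n) - m / n for one pattern draw.

    xi is an n x m array of i.i.d. symmetric signs and
    X_mu = n^(-1/2) * sum_j xi_j^mu, so that
    (xi_i, X) / sqrt(n) = (1/n) * sum_mu xi_i^mu * S_mu with S_mu = sum_j xi_j^mu.
    """
    rng = random.Random(seed)
    xi = [[1 if rng.random() < 0.5 else -1 for _ in range(m)]
          for _ in range(n)]
    col_sums = [sum(row[mu] for row in xi) for mu in range(m)]
    return [sum(r * s for r, s in zip(row, col_sums)) / n - m / n
            for row in xi]


def mean_and_variance(values):
    """Empirical mean and (biased) empirical variance of a list of numbers."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    return mean, var


if __name__ == "__main__":
    n, m = 2000, 200                     # alpha = m / n = 0.1
    mean, var = mean_and_variance(empirical_overlap_sample(n, m, seed=1))
    print("alpha =", m / n, "mean =", mean, "variance =", var)
```

At these sizes the empirical variance still fluctuates from sample to sample by roughly $\sqrt{2/M}$ in relative terms, so agreement with $\alpha$ should only be expected to within about ten percent.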